Varakin Klemes Porter 2013 QJEP pre

Running Head: Time & Scene Perception

The Effect of Scene Structure on Time Perception

D. Alexander Varakin

Keith J. Klemes

Kennetha A. Porter

Eastern Kentucky University

This manuscript was published in the Quarterly Journal of Experimental Psychology, vol 66,

issue #8, 2013. Visit QJEP’s website for the penultimate version of this paper.

Please address correspondence concerning this article to the first author at: Department of

Psychology, 127 Cammack Building, 521 Lancaster Ave., Richmond, KY, USA 40475. Telephone: 1

(859) 622 - 2511. e-mail: [email protected]

TIME & SCENE PERCEPTION 2

Acknowledgements

The authors would like to thank Catherine Clement, Jonathan Gore and two anonymous

reviewers for commenting on previous drafts of the paper, and Jerry Palmer and Yong Wang for

helpful statistical advice.

Abstract

The current experiments examined the hypothesis that scene structure affects time perception. In

three experiments, participants judged the duration of realistic scenes that were presented in a

normal or jumbled (i.e. incoherent) format. Experiment 1 demonstrated that the subjective

duration of normal scenes was greater than the subjective duration of jumbled scenes. In Experiment

2, gridlines were added to both normal and jumbled scenes to control for the number of line

terminators, and scene structure had no effect. In Experiment 3, participants performed a

secondary task that required paying attention to scene structure, and scene structure’s effect on

duration judgments reemerged. These findings are consistent with the idea that perceived

duration can depend on visual-cognitive processing, which in turn depends on both the nature of

the stimulus and the goals of the observer.

The Effect of Scene Structure on Time Perception

The question of how people perceive and represent realistic scenes is an important

research topic in perceptual and cognitive sciences. Henderson and Hollingworth (1999) define

scenes as semantically coherent (and often nameable) views of human-scaled environments that

include background elements (e.g. walls, floors, the sky etc.), support surfaces (e.g. tables, chairs

etc.) and discrete moveable objects (e.g. coffee cups) that are arranged in a spatially appropriate

manner. Despite their complexity, realistic scenes are processed remarkably fast. For example,

it takes about 100ms to determine the “gist” of a scene and about 400ms to store scene

information in memory (Potter, 1975, 1976). While there are many experiments examining how

scene processing unfolds over time, to the best of our knowledge, there have been no

experiments examining how scene perception affects the experience of time. This gap in the

literature is somewhat surprising because scene perception research is driven largely by the

desire to use stimuli (i.e. realistic scenes) that are more representative of what people experience

in everyday life, and time is an inescapable aspect of everyday life.

Although not yet studied in the context of scene perception, the study of how people

perceive brief intervals of time (in the range of hundreds of milliseconds to seconds) is a relatively popular

topic in cognitive science and related disciplines (e.g. Droit-Volet & Meck, 2007; Fraisse, 1984;

Ivry & Schlerf, 2008). One of the most robust findings to come out of this work is that

perception of time depends, in part, on the properties of the visual stimulus being viewed.

However, research on how visual features affect perceived time has tended to use simple stimuli,

such as lines or abstract geometric shapes that vary in size (Ono & Kawahara, 2007; Xuan,

Zhang, He, & Chen, 2007), brightness (Xuan, et al., 2007) or whether they should capture an

observer’s attention (New & Scholl, 2009; Tse, Intriligator, Rivest, & Cavanagh, 2004).

Realistic pictures have been used in research on time perception, but this research has typically

addressed issues about how emotional content (Droit-Volet & Meck, 2007) or repetition (e.g.

Pariyadath & Eagleman, 2007) affects time perception. Words have also been used as stimuli

(e.g. Reingold & Merikle, 1988), and although words are common stimuli that people see every

day, they are processed differently than realistic pictures (e.g. words allow easier access to

lexical information, Potter & Faulconer, 1975). In summary, the literature on time perception is

extensive, but there has been little systematic investigation of how the structure of realistic

scenes affects time perception.

However, there has been research investigating how event structure affects perception of

(Brown & Boltz, 2002; Liverence & Scholl, 2012) or memory for (Boltz, 1995; 2005) duration.

Event coherence is defined in terms of the extent to which the temporal and non-temporal

information available in a stimulus co-defines one another (Boltz, 1995). In a melody, for

example, a given note can be accented by either temporal information (e.g. position in a

succession of notes) or non-temporal information (e.g. volume), and when these two sources of

information accent the same notes (e.g. every 4th note is louder than other notes), then a melody

would be coherent. In addition to music, prose passages (Brown & Boltz, 2002), naturalistic

videos (Boltz, 2005) and simple animations of moving objects (Liverence & Scholl, 2012) have

also been used as events. Several experiments suggest that coherent events are perceived as

lasting longer than incoherent events (Brown & Boltz, 2002; Liverence & Scholl, 2012), one

explanation being that incoherent events disrupt the ability to process temporal information

(Brown & Boltz, 2002). The current experiments examine a question similar to that of this past work,

in asking how the structure of a stimulus affects time perception. However, whereas past work

examined the influence of spatio-temporal coherence on time perception, the current experiments

examine the influence of spatial structure on time perception.

In the current context, the term structure refers to the visual features of a scene that

define spatial layout, including the relative location of objects, support surfaces, and other

background elements (cf. Varakin & Levin, 2008). The current experiments specifically tested

whether jumbling scene structure affects time perception (see Figure 1a and 1b for an example of a

normal scene and a jumbled scene). Jumbling preserves some of the low-level properties (e.g.

overall luminance and chromaticity) and many (though not all) of the local details of a scene, but

disrupts the global spatial layout. Thus, the local details and the global layout no longer co-

define one another. Previous scene perception research suggests that jumbling scene structure

interferes with various aspects of scene processing, including processes related to finding and

identifying objects (Biederman, 1972; Biederman, Glass & Stacy, 1973; Biederman, Rabinowitz,

Glass, & Stacy, 1974; Foulsham, Alan, & Kingstone, 2011), detecting changes (Varakin &

Levin, 2008; Yokosawa & Mitsumatsu, 2003), and judging spatial relationships (Sanocki,

Michelet, Sellers, & Reynolds, 2006).

Should jumbling scene structure affect time perception as well? The literature on event

structure and time perception (e.g. Brown & Boltz, 2002; Liverence & Scholl, 2012) would seem

to predict that degrading structure should lead to time compression, but since spatio-temporal

incoherence does not imply spatial incoherence, it is not immediately clear whether jumbling

scene structure will have similar effects on perceived time. However, the literature on how static

visual features affect time perception lends further support to the prediction that normal scenes

should seem to last longer than jumbled scenes. That is, given equal objective durations, the

subjective duration of normal scenes should be greater than that of jumbled scenes. One idea

from the literature on time perception that led us to this prediction is the hypothesis that duration

judgments are based on how much perceptual information is processed during an interval (Tse et

al., 2004). Supporting evidence for this hypothesis comes from experiments demonstrating that

objects that capture attention seem to last longer than unattended objects (e.g. New & Scholl,

2009; Pariyadath & Eagleman, 2007; Tse et al., 2004). Presumably, attending to a stimulus

increases the amount of information processed per unit of objective time thereby increasing the

subjective duration of attended stimuli relative to unattended stimuli (e.g. Tse et al., 2004).

Previous research on scene perception is consistent with the idea that jumbling a scene reduces

how quickly useful information can be obtained from it (Biederman et al., 1973; Biederman et

al., 1974; Foulsham et al., 2011; Sanocki et al., 2006; Varakin & Levin, 2008). Thus, if duration

judgments are based on how much information is processed during an interval, it follows that

normal scenes should seem to last longer than jumbled scenes.

Another hypothesis consistent with our prediction is that magnitudes are represented

using a general magnitude system (Walsh, 2003). In this context, the term magnitude refers to

anything that can be represented as a quantity of some kind, for example, length, width, size, or

time. If the resources for representing different kinds of magnitudes overlap, then manipulations

that affect perception of one kind of magnitude should also affect perception of other

magnitudes. Indeed, experiments using simple stimuli have demonstrated that perceived

duration increases as length (Casasanto & Boroditsky, 2008; Jones & Huang, 1982; Lebensfield

& Wapner, 1968) or size (Ono & Kawahara, 2007; Xuan, et al., 2007) increases. There is

evidence that individual sections of jumbled scenes are functionally independent (Sanocki et al.,

2006), and since the line segments and overall spatial extents within sections of jumbled scenes

tend to be smaller in magnitude than the line segments and spatial extents of normal scenes, it

follows that the subjective duration of jumbled scenes should be less than that of normal scenes.

In summary, research on scene perception has (to the best of our knowledge) ignored the

question of how scene structure affects time perception. Past research on time perception

unequivocally supports the notion that perception of time is malleable, and leads to the

prediction that scene structure should affect duration judgments: normal scenes should seem to

last longer than jumbled scenes.

Experiment 1

Experiment 1 tested the hypothesis that the subjective duration of normal scenes would

be greater than that of jumbled scenes using the temporal bisection task. Observers were asked

to judge whether briefly presented scenes were closer in duration to a pre-learned short or long

standard duration. If scene structure affects time perception, then normal scenes should reliably

elicit “long” responses at shorter durations than jumbled scenes.

Method

Participants. Nineteen students (mean age = 19.4, 6 males) at Eastern Kentucky

University participated in exchange for course credit. Two participants’ data were discarded

because the proportion of “long” responses was about the same for all durations (i.e. on the

shortest and longest durations, the proportion of long responses was about .40). All participants

reported normal or corrected vision (in this and all subsequent experiments reported here).

Stimuli and Apparatus. Stimuli were 49 normal scenes and their jumbled counterparts

(see Figure 1a and 1b) from Experiment 1 of Varakin and Levin (2008)1. Each scene was sized

to 640 x 480 pixels. Stimuli were presented on a white background using iMac computers with

21.5-inch (diagonal) wide-screen LED-backlit monitors set at a resolution of 1680 x 1050.

PsyScope (Cohen, MacWhinney, Flatt & Provost, 1993) controlled stimulus presentation and

recorded responses. Viewing distance was not controlled.

Procedure. A temporal bisection task was used that consisted of a training phase and a

test phase. During the training phase, participants learned the short and long standard durations.

On each trial in the training phase, a 640 x 480 white box with a black border was presented for

either 400 ms (the short standard) or 1600 ms (the long standard). A response prompt appeared

after the box disappeared. Ten participants pressed “k” for the long standard, and “d” for the

short standard, and this response mapping was reversed for the remaining nine participants.

Feedback, in the form of a high pitched beep, was given after incorrect responses. Participants

performed 20 total training trials, 10 for each standard. Order was randomized.

On each trial in the test phase, a scene was presented for 400, 600, 800, 1000, 1200, 1400

or 1600 ms (determined randomly on each trial). The participants’ task was to judge whether the

scene’s duration was closer to the long or short standard, using the response mapping learned

during training. After each scene disappeared, a prompt appeared on the screen that read “press

d for short, press k for long” or “press d for long, press k for short”, depending on which

response mapping was learned in the training phase. There were 196 test trials (49 normal scenes

and their 49 jumbled counterparts each presented twice). Order of presentation was randomized.

No feedback was given in the test phase.
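
The test-phase design described above can be illustrated with a short Python sketch. It is illustrative only: the experiment itself was run in PsyScope, and the scene identifiers (scene_01 to scene_49), the function name build_test_trials, and the seed argument are hypothetical. The sketch builds 196 trials (49 scenes x 2 formats x 2 presentations), draws each trial's duration at random from the seven test durations, and shuffles the presentation order.

```python
import random

# The seven test durations reported in the Procedure (in ms).
TEST_DURATIONS_MS = [400, 600, 800, 1000, 1200, 1400, 1600]

def build_test_trials(n_scenes=49, repetitions=2, seed=None):
    """Return a shuffled list of test trials (hypothetical data layout)."""
    rng = random.Random(seed)
    trials = []
    for scene in range(1, n_scenes + 1):
        for scene_type in ("normal", "jumbled"):
            for _ in range(repetitions):
                trials.append({
                    "scene": f"scene_{scene:02d}",   # hypothetical identifier
                    "type": scene_type,
                    # Duration is drawn at random on each trial.
                    "duration_ms": rng.choice(TEST_DURATIONS_MS),
                })
    rng.shuffle(trials)  # randomize order of presentation
    return trials

trials = build_test_trials(seed=1)
print(len(trials))  # 196
```

Because durations are drawn independently on each trial, the number of trials at each duration is not guaranteed to be equal across scene types in this sketch.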

Results and Discussion

Average proportion of long responses as a function of stimulus duration is plotted in

Figure 2. Data were analyzed by calculating separate bisection points for each scene type

(normal vs. jumbled) for each participant’s individual data. A bisection point is the point in the

psychometric function relating actual duration to the proportion of long-responses at which 50%

of responses are predicted to be “long”. These were calculated with a logit function (using PASW

18). Individual curves were fit for each participant’s individual data (separately for normal and

jumbled scenes), and the averages of the resulting bisection points were then calculated for each

scene type. As subjective duration increases, bisection points should decrease because long

responses should be reliably elicited at shorter durations. Consistent with the idea that the

subjective duration of normal scenes is greater than the subjective duration of jumbled scenes,

the average bisection point for normal scenes (986ms, SD = 162) was smaller than the average

bisection point for jumbled scenes (1070ms, SD = 260), t (16) = 2.33, p < .05, d = .39.
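
The bisection-point calculation described above can be sketched as follows. This is a minimal Python illustration and not the PASW 18 procedure used in the experiments: it fits a two-parameter logistic function by Newton-Raphson maximum likelihood, and the function names (fit_logit, bisection_point_ms) and the response counts are made up for the example. The bisection point is the duration at which the fitted curve crosses 50% "long" responses, i.e. -b0/b1.

```python
import math

def fit_logit(durations_ms, n_long, n_trials, max_iter=100, tol=1e-10):
    """Maximum-likelihood fit of P(long) = 1 / (1 + exp(-(b0 + b1 * x))),
    with x in seconds, using Newton-Raphson on grouped binomial counts."""
    xs = [d / 1000.0 for d in durations_ms]  # seconds keep the steps well scaled
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(xs, n_long, n_trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = k - n * p        # observed minus expected "long" count
            w = n * p * (1.0 - p)    # binomial variance weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

def bisection_point_ms(b0, b1):
    """Duration (ms) at which the fitted curve predicts 50% "long" responses."""
    return -b0 / b1 * 1000.0

# Made-up counts for one hypothetical participant and one scene type.
durations = [400, 600, 800, 1000, 1200, 1400, 1600]
n_trials = [14] * 7
n_long = [1, 2, 4, 7, 10, 12, 13]   # symmetric around 1000 ms by construction
b0, b1 = fit_logit(durations, n_long, n_trials)
print(round(bisection_point_ms(b0, b1)))  # 1000
```

In an analysis like the one reported here, such a fit would be run separately for each participant and scene type, and the resulting bisection points would then be averaged and compared across scene types.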

These results support the hypothesis that normal scenes seem to last longer than jumbled

scenes. However, there is one confound that should be addressed before we can conclude that it

was the disruption of scene structure that led to the difference in subjective duration. Jumbled

scenes contain a greater number of line terminators than normal scenes (compare Figures 1a and

1b). It is possible that the artificial perceptual noise created by the line terminators is the cause

of the difference in subjective duration between normal and jumbled scenes, and not scene

structure per se. Experiment 2 addresses this possibility.

Experiment 2

Experiment 2 was an attempt to replicate Experiment 1, while better controlling for low-

level feature differences between normal and jumbled scenes. In much of the research

examining the effects of jumbling on scene perception, gridlines are superimposed over both

normal and jumbled scenes, making normal scenes appear as if they were being viewed through

window panes (as in the current Figures 3a and 3b). The gridlines effectively equate the number

of line terminators contained in normal and jumbled versions of scene pairs. Since most of the

research comparing normal and jumbled scenes has utilized gridlines in order to equate the

number of terminating lines, we expected to replicate the results of Experiment 1.

Method

The method in Experiment 2 was identical to Experiment 1 except as noted below.

Participants. Twenty-six students (mean age = 20.7 years, 8 males) at Eastern Kentucky

University participated for course credit. None of the participants participated in Experiment 1.

Stimuli and Apparatus. The scenes were identical to those from Experiment 1, except a

grid was superimposed over each image (see Figure 3). Each line segment of the grid was 2

pixels thick. The equipment was the same. SuperLab 4.0 (Cedrus Inc., San Pedro, CA, USA)

controlled stimulus presentation and recorded responses.

Results and Discussion

The average proportion of long responses as a function of stimulus duration is plotted in

Figure 4. Data were analyzed as in Experiment 1. The average bisection points for normal

scenes (988ms, SD = 177) and for jumbled scenes (1008ms, SD = 145) did not differ reliably,

t (25) < 1. Therefore, these results failed to replicate Experiment 1. The subjective

duration of normal and jumbled scenes was statistically equivalent when gridlines were added to

the scenes. This result is surprising because in previous work gridlines did not eliminate

differences between normal and jumbled scenes (Exp. 2, Varakin & Levin, 2008). However, the

tasks used in previous work required participants to pay attention to the visual features of the

scene, but the current task did not. Since scene structure was irrelevant to the task, participants

may have paid little attention to scene structure, perhaps because the (relatively) high-contrast

horizontal and vertical lines of the grid were easier to perceive (as in the oblique effect, Appelle,

1972, but also see Hansen & Essock, 2003).

Experiment 3

In Experiment 1, the subjective duration of normal scenes was greater than the subjective

duration of jumbled scenes. However, this effect was eliminated in Experiment 2, when gridlines

were added to the scenes. One possible explanation for these results is that participants were less

likely to pay attention to scene structure when gridlines were present. If that is the case, then if

participants are required to pay attention to scene structure, the effect of scene structure on

duration judgments should reemerge, even when gridlines are present. Experiment 3 addressed

this possibility using the same scenes and temporal bisection task as Experiment 2. However,

the temporal bisection task was performed on only half of the trials in the test phase. On the

other half, participants classified scenes as either normal or jumbled. Participants did not know

which judgment (duration or scene structure) would be required until the end of the trial, so on

each trial participants had to pay attention to the scene’s structure and its duration.

Method

The method in Experiment 3 was identical to Experiment 2 except as noted below.

Participants. Seventeen students (mean age = 20.9 years, 8 males) at Eastern Kentucky

University participated in exchange for course credit. One participant’s data were dropped for

giving “long” responses on about 50% of trials, regardless of actual duration. None of the

participants had participated in Experiment 1 or 2.

Procedure. As in the previous experiments, there was a training phase and a test phase.

The training phase was almost identical to the training phase in Experiments 1 and 2 (i.e. the

short standard was 400ms, the long standard was 1600ms). The only difference was the response

keys. In Experiment 3, all participants responded using the number pad on the right side of the

keyboard, pressing “4” for the short standard and “6” for the long standard. (Response mapping

for duration judgments was not counterbalanced between participants in this experiment because

none of the effects involving this factor approached significance in the previous experiments.)

The number pad was selected as the response mechanism because it enabled participants to make

all responses with one hand, using the “8” and “2” (which are aligned orthogonally to the “4”

and “6” keys) to classify scenes as normal or jumbled.

On each trial in the test phase, a scene was presented for 400, 600, 800, 1000, 1200, 1400

or 1600 ms (determined randomly on each trial). After the scene disappeared, a response prompt

appeared. On duration judgment trials, the prompt read “press 4 for short, press 6 for long”. On

scene judgment trials, the prompt read “press 8 for normal, press 2 for jumbled” for 9

participants, and “press 8 for jumbled, press 2 for normal” for the remaining participants. There

were a total of 196 duration judgment trials (49 normal and their jumbled counterparts each

presented twice), and 196 scene judgment trials (49 normal and their jumbled counterparts each

presented twice). Tasks were presented in random order (randomized separately for each

participant); thus, participants did not know which response would be relevant until the scene

had disappeared.

Results

Performance on the scene judgment task was used to ensure that participants were paying

attention to scene structure. Overall accuracy on the scene judgment task was 94%. Thus,

participants were clearly paying attention to the structure of the scenes.

The average proportion of long responses as a function of stimulus duration is plotted in

Figure 5. Data were analyzed as in previous experiments. The average bisection point for

normal scenes (921ms, SD = 239) was smaller than the average bisection point for jumbled

scenes (1019ms, SD = 217), t (15) = 2.67, p < .05, d = .43. Since the bisection points varied

somewhat between Experiment 1 and Experiment 3, an ANOVA with scene type (normal or

jumbled) as a within subject factor and Experiment (1 or 3) as a between subjects factor was

conducted in order to test whether these fluctuations were statistically significant. Of course, the

main effect of scene type was reliable, F (1, 31) = 12.50, p < .001, partial η² = .29, but the main

effect of Experiment and the scene-type by experiment interaction were not (both Fs (1, 31) < 1).

As in Experiment 1, Experiment 3’s results suggest that normal scenes seemed to last longer than

jumbled scenes. Together, Experiments 2 and 3 suggest that when gridlines are present, scene

structure only affects duration judgments if participants pay attention to scene structure.
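
As a quick arithmetic check on the reported effect sizes, the following Python sketch recomputes them from the summary statistics given in the text. It rests on an assumption: the text does not state which formula was used for d, so the sketch uses the mean difference divided by the root mean square of the two standard deviations, a convention that happens to reproduce the reported values. Partial eta squared is recovered from F and its degrees of freedom using the standard identity. The function names are hypothetical.

```python
import math

def cohens_d_avg_sd(mean_a, sd_a, mean_b, sd_b):
    """Mean difference divided by the root mean square of the two SDs
    (an assumed convention; the text does not specify the formula)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return abs(mean_a - mean_b) / pooled_sd

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Bisection points (ms) and SDs as reported for normal vs. jumbled scenes.
d_exp1 = cohens_d_avg_sd(986, 162, 1070, 260)   # Experiment 1
d_exp3 = cohens_d_avg_sd(921, 239, 1019, 217)   # Experiment 3
eta_p2 = partial_eta_squared(12.50, 1, 31)      # main effect of scene type

print(round(d_exp1, 2), round(d_exp3, 2), round(eta_p2, 2))  # 0.39 0.43 0.29
```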

General Discussion

The current experiments tested the effect of scene structure on perception of time. In

Experiment 1, the subjective duration of normal scenes was greater than the subjective duration

of jumbled scenes. However, Experiment 2’s results suggest that scene structure is not sufficient

to affect perceived time: when gridlines were added to normal and jumbled scenes to equate the

number of terminating lines, scene structure had no effect on perceived duration. In Experiment

3, participants performed a secondary task that required attending to scene structure (i.e. judging

whether scenes are normal or jumbled), and the subjective duration of normal scenes was greater

than that of jumbled scenes, even though gridlines were present in both kinds of scene. Overall, these

results suggest that scene structure affects time perception, but only when the structure is easy to

perceive (as in Experiment 1) or participants actively pay attention to structure (as in Experiment

3).

Scene Structure’s Effect on Time Perception is Not Automatic

The current findings are consistent with the hypotheses used to generate the prediction

about how scene structure would affect time perception, i.e. that time perception correlates

positively with how much perceptual information is processed (Tse et al., 2004) and perception

of non-temporal magnitudes (Walsh, 2003). However, research using simpler stimuli often

found that even task-irrelevant aspects of visual stimuli affect duration judgments. For example,

Casasanto and Boroditsky (2008) found that the length of a line affected duration estimates even

though length was irrelevant to the response. Similarly, in Experiment 1 scene structure affected

duration judgments even though it was not relevant to the duration judgment task. In contrast,

the results of Experiments 2 and 3 suggest that task-relevance matters: scene structure did not

affect duration judgments in Experiment 2 when participants did not have to attend to it, but did

affect duration judgments in Experiment 3 when participants did have to attend to it.

Why did task-relevance matter across Experiments 2 and 3, but not in Experiment 1 or in

much past work? It is possible that in previous work, the task-irrelevant aspects of the visual

stimulus (e.g. the length of a line as in Casasanto & Boroditsky, 2008, or scene structure in

Experiment 1) affected duration judgments because perceiving these attributes was relatively

easy to do. In contrast, in Experiment 2, the gridlines may have made perceiving scene structure

more difficult (the results of a supplementary scene classification experiment, reported below,

are consistent with this proposal). The current results therefore support the idea that perception

of time depends on more than what an observer is viewing. Perception of time also depends on

how an observer is processing what is being viewed.

Pacemaker Counter Models

Many models of time perception rely on the notion of an internal time-keeping

mechanism that consists of a pacemaker that emits pulses at some rate and a counter that collects

the pulses (for reviews see Brown, 2008; Buhusi & Meck, 2005; Droit-Volet & Meck, 2007; Ivry

& Schlerf, 2008; Zakay & Block, 1995). In such pacemaker counter models, duration judgments

are based on a comparison of how many pulses are collected to some kind of reference memory.

In some models, a gate modulates the flow of pulses from the pacemaker to the counter (Brown,

2008; Zakay & Block, 1995). Paying more attention to time causes the gate to open wider,

allowing more pulses to flow through, while paying less attention to time causes the gate to close

and/or open and close sporadically, thereby interfering with the flow of pulses from the

pacemaker to the counter. Essentially, paying more attention to time makes it seem like more

time is going by.
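
The gating idea described above can be illustrated with a toy Python sketch. It is not an implementation of any of the cited models: the regular 50 Hz pacemaker rate, the treatment of the gate as a per-pulse pass probability, and all names (count_pulses, gate_open_prob) are assumptions made purely for illustration. The point is only that, for a fixed objective duration, a less open gate leaves fewer pulses in the counter.

```python
import random

def count_pulses(duration_ms, pulse_rate_hz=50.0, gate_open_prob=1.0, seed=0):
    """Count pulses reaching the counter during one interval.

    Assumptions (hypothetical): the pacemaker emits pulses at a fixed rate,
    and attention to time is modeled as the probability that the gate lets
    each pulse through.
    """
    rng = random.Random(seed)
    n_emitted = round(duration_ms / 1000.0 * pulse_rate_hz)
    return sum(1 for _ in range(n_emitted) if rng.random() < gate_open_prob)

# Same objective duration, different amounts of attention to time.
full_attention = count_pulses(1000, gate_open_prob=1.0)
divided_attention = count_pulses(1000, gate_open_prob=0.6)
print(full_attention, divided_attention)
```

With the gate fully open, every emitted pulse is counted; lowering gate_open_prob reduces the count, which in a pacemaker counter account would correspond to a shorter subjective duration.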

Pacemaker counter models can account for why jumbled scenes’ subjective durations are

less than normal scenes’. Jumbled scenes, being more difficult to process, would prevent

participants from paying attention to the output of the pacemaker. As a consequence, fewer pulses

would be collected by the counter and less time would seem to have passed (Brown, 1985, 2008;

Brown & Boltz, 2002; Thomas & Weaver, 1975). If one assumes that additional gridlines make

processing the normal scenes more difficult, these models can also explain the results of

Experiment 2, where there was no difference in the subjective duration of normal and jumbled

scenes. Therefore, a pacemaker-counter model can readily account for the results of the first two

experiments.

However, a model with a pacemaker that is dedicated to producing pulses for the sake of

keeping track of time would have some difficulty explaining the results of Experiment 3. Recall,

in Experiment 3 gridlines were superimposed over both normal and jumbled scenes and

participants performed two tasks: one that required attending to scene structure and the other

paying attention to duration. A model that includes a dedicated pacemaker would predict that the

additional task (i.e. the scene classification task) should, if anything, make it even more difficult

for participants to attend to the output of the pacemaker (e.g. Brown, 1985, 1997, 2008; Thomas

& Weaver, 1975). According to these models, dividing attention between two tasks should cause

the activity of the gate to become more variable, making it more difficult to reveal the effect of

scene structure on duration judgments. In contrast to this prediction, Experiment 3’s results were

(statistically) identical to Experiment 1’s results (between experiment statistical tests did not

yield any reliable effects).

In order to explain why the effect of scene structure reemerged in Experiment 3, a model

with a dedicated pacemaker would have to propose that normal scenes with gridlines are

identified more quickly than jumbled scenes with gridlines. If less time was needed to classify

normal scenes than jumbled scenes, then more time could be spent paying attention to the output

of the pacemaker when normal scenes were presented, which would allow more pulses to be

collected, ultimately giving rise to the impression that more time had passed. Recall that

Experiment 1’s results can be explained in a similar way, but in Experiment 1 there were no

gridlines and no secondary task.

As a preliminary test of whether normal or jumbled scenes are classified faster, and

whether this effect interacts with the presence of gridlines, we conducted a speeded classification

experiment with n = 24 (11 male) observers using a 2 (normal vs. jumbled) x 2 (gridlines vs. no

gridlines) within-subject factorial design (49 trials in each cell). On each trial, a scene appeared

and the observers’ task was to classify scenes as normal or jumbled as quickly as possible. All

main effects and interactions in separate 2 (normal vs. jumbled) by 2 (gridlines vs. no gridlines)

repeated measures ANOVAs on mean response time2 (RT) and mean accuracy were significant

(response time: main effect of scene structure, F (1, 23) = 4.41, p < .05, main effect of gridlines,

F (1, 23) = 26.84, p < .01, interaction, F (1, 23) = 36.67, p < .001; accuracy: main effect of scene

structure, F (1, 23) = 10.91, p < .01, main effect of gridlines, F (1, 23) = 32.27, p < .01,

interaction, F (1, 23) = 30.92, p < .01). To follow up on the interactions, planned simple main

effects analyses (with Bonferroni corrected p-values) were conducted separately for RT and

accuracy to test the effect of scene structure when gridlines were present or absent. With no

gridlines, responses were faster (t (23) = 6.14, p < .001) and more accurate (t (23) = 2.69, p <

.05) for normal scenes (Ms = 588ms and 97% for RT and accuracy respectively) than for

jumbled scenes (Ms = 649ms and 95% for RT and accuracy respectively). However, when

gridlines were present, RTs were equivalent for normal (M = 653ms) and jumbled (M = 641ms)

scenes (t (23) <1), and accuracy was worse (t (23) = 5.40, p < .001) for normal scenes (M = 89%)

than for jumbled scenes (M = 96%). These results suggest that normal scenes are easier to

classify only when no gridlines are present. When gridlines are present (as was the case in

Experiment 3), these results suggest that classifying normal scenes may actually be harder than

classifying jumbled scenes. Therefore, the results are inconsistent with the idea that the effect of

scene structure reemerged in Experiment 3 because normal scenes with gridlines are processed

more quickly than jumbled scenes with gridlines. In other words, these results seem inconsistent

with the idea that participants in Experiment 3 had more time to pay attention to a hypothetical

pacemaker when normal, as opposed to jumbled, scenes were being viewed.
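
The follow-up comparisons described above (paired tests of scene structure within each gridline condition, with Bonferroni-corrected p-values) can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical, the response times are made-up values for four observers rather than data from the experiment, and the uncorrected p-values passed to the correction are placeholders.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical mean RTs (ms) per observer; NOT data from the experiment.
jumbled_rt = [650.0, 640.0, 660.0, 655.0]
normal_rt = [590.0, 585.0, 600.0, 580.0]

t_value, df = paired_t(jumbled_rt, normal_rt)

# Placeholder uncorrected p-values for the two planned comparisons
# (gridlines absent, gridlines present).
adjusted = bonferroni([0.004, 0.03])
```

With two planned comparisons per dependent measure, the correction simply doubles each p-value, so a comparison must reach p < .025 uncorrected to be reported as significant at .05.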

Tse et al. (2004) proposed a modification to traditional pacemaker-counter models that

could help account for Experiment 3’s results. Tse et al. proposed that attention to a stimulus

might not only reduce the number of pulses that are missed (due to increased variability in the

operation of the switch), but could also increase the number of “pulses” that are generated, due to

an increased rate of information processing. In this view, what is being counted is not

necessarily the output of an internal pacemaker, whose primary purpose is time keeping. Rather,

observers might be keeping track of how much perceptual information has been processed over

the interval (in which case the term “pulse” might be a misleading term). We can be reasonably

confident that observers extract more information from normal scenes than jumbled scenes (even

if normal scenes with gridlines are harder to classify, cf. the preceding paragraph) because

performance on a variety of tasks, including visual search, object identification, spatial judgment

and change detection, is enhanced when observers are viewing normal as opposed to jumbled

scenes (e.g. Biederman, 1972; Biederman et al., 1973; Biederman et al., 1974; Foulsham et al.,

2011; Sanocki et al., 2006; Varakin & Levin, 2008). Accordingly, when observers attend to

normal scenes, they acquire more useful information over the course of the interval, and if how

much (or how much useful) information is processed during an interval can serve as a unit of

time, then the counter would interpret the increase in information as an increase in duration.
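
The logic of this proposal can be made concrete with a toy sketch. It assumes, purely for illustration, that the counter tallies units of processed information at a constant rate over the interval and that an interval is judged "long" when the tally exceeds a fixed criterion. The rates and the criterion are hypothetical values chosen for the example; they are not estimates from the current experiments, and the function names are not part of any published model.

```python
def accumulated_count(clock_s, units_per_s):
    """Units of processed information tallied over clock_s seconds."""
    return clock_s * units_per_s


def judged_long(count, criterion):
    """Classify an interval as 'long' when the tally exceeds the criterion."""
    return count > criterion


# Hypothetical processing rates (arbitrary units per second).
NORMAL_RATE = 12.0
JUMBLED_RATE = 10.0
CRITERION = 11.0  # hypothetical decision criterion

clock_s = 1.0  # identical clock duration for both scene types
normal_count = accumulated_count(clock_s, NORMAL_RATE)
jumbled_count = accumulated_count(clock_s, JUMBLED_RATE)
```

Under these assumptions, the same clock duration yields a larger tally for the normal scene, so the normal scene is more likely to be classified as long even though nothing about a dedicated pacemaker has changed.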

Some past work suggests that the relationship between information processing and

perceived time is negative, which appears to contrast with the current proposal that the

relationship between information processing and perceived time is positive. In particular, Hicks,

Miller and Kinsbourne (1976) suggested that as the rate of information processing increases,

perceived duration decreases. In their experiment, participants sorted cards for 42 seconds.

Information processing was operationalized in terms of how difficult the sorting task was. The

most difficult sorting task (sorting cards by suit) required more information processing than the

simpler tasks (sorting by color, or not sorting at all). At first glance, this proposal seems

inconsistent with our own, but because the procedures and relevant time scales (about 1 second

vs. 42 seconds) are so different, there are many potential ways to reconcile the two claims. For

example, another way to interpret the Hicks et al. result is in terms of task difficulty, and not in

terms of information processing. In terms of difficulty, their results and ours align: in Hicks et al.’s

results and in our own, difficulty was negatively related to perceived time, as we have argued,

based on previous results, that jumbled scenes are generally more difficult to process than

normal scenes. It is also possible that only some kinds of information can contribute positively

to estimated duration, whereas other kinds of information do not. There is no particular reason to

predict that color or card suit (the relevant information in Hicks et al.) would increase perceived

duration, but as we discussed in the introduction and outline further in the next section, there are

theoretical reasons to predict that information contained in scene representations would have

such an effect.

Visual Cognition and Perceived Time

Even with this modification to a pacemaker-counter model, there are several alternative

ways to account for the current results. The current results cannot specify which kinds of

information are (singly or jointly) responsible for the increase in subjective duration.

“Information” is a somewhat vague term, and in the current context could refer to either

perceptual or semantic information. Both perceptual (e.g. perceived size; Ono & Kawahara,

2007) and semantic (e.g. word vs. non-word; Reingold & Merikle, 1988) characteristics of a visual

stimulus can affect its subjective duration, and since our jumbling procedure confounds them, we

cannot disentangle the relative contribution of each. However, the current results do appear to

rule out information conveyed by the activity of low-level feature detectors as a cause of

increased subjective duration in the context of scene perception. Normal scenes and jumbled

scenes are equivalent in terms of a host of low-level perceptual attributes (e.g. mean luminance);

thus, these low-level attributes cannot be responsible for the current effect. Without gridlines,

jumbled scenes do contain more low-level features (e.g. line terminators) than normal scenes, but

jumbled scenes’ subjective durations were compressed relative to normal scenes’ (Experiment

1). If a tally of low-level features was considered good information for purposes of estimating

duration, then jumbled scenes’ subjective duration should have been greater than normal scenes’

in Experiment 1, but the opposite pattern was observed. Moreover, the gridlines used in

Experiment 3 controlled for any differences between normal and jumbled scenes in terms of line

terminators, yet the perceived duration of normal scenes was still greater than that of jumbled scenes.

If tallying low-level features did not contribute to increased subjective duration, then it is

likely that the products of visual cognition did. By visual cognition, we mean the inferential

operations carried out at mid- to high-levels of visual processing that are involved in setting up a

“workable simulation” of the scene (Cavanagh, 2011). These processes include (but are not

limited to) assigning edges to surfaces, identifying objects, and extracting information about

spatial layout and scene gist. As mentioned previously, jumbling has been shown to interfere

with observers’ ability to extract information about these higher-level scene attributes, including

spatial layout (Sanocki et al., 2006), and object identity (Biederman et al., 1973).

Additional support for the idea that the current effects were driven by visual cognition is

that the effect depended on what observers attended to (Experiments 2 and 3). Importantly,

attention has strong effects on the nature of visual cognitive operations (Cavanagh, 2011). At a

basic level, observers obtain less information from unattended than attended stimuli. At an

extreme, if a stimulus is not attended, then it may not be perceived at all (e.g. Mack & Rock,

1998). If visual cognition depends on what is attended, and time perception depends on visual

cognition, then time perception would be expected to depend on attention (see also New &

Scholl, 2009; Pariyadath & Eagleman, 2007; Tse et al., 2004). There are at least two general

ways that attention might be relevant here.

First, it is possible that scene structure affects the degree to which a stimulus captures

attention. There is some evidence that well-structured stimuli attract attention to a greater degree

than less-structured stimuli (e.g. Zhao, Al-Aidroos, & Turk-Browne, in press), which lends

support to the idea that normal scenes might capture attention to a greater degree than jumbled

scenes. In contrast, evidence that low-level luminance transients capture attention, but that new

perceptual objects do not (e.g. Franconeri, Hollingworth & Simons, 2005), would run counter to

the idea that normal scenes capture attention more than jumbled scenes. In any case, appealing to

greater attentional capture by normal scenes can explain only the results of Experiment 1, but not

the results of Experiments 2 and 3: there was no effect of scene structure in Experiment 2, and in

Experiment 3, participants were instructed to pay attention to scene structure. The pattern of

results across these experiments therefore suggests that factors other than attentional capture are

needed to explain why normal scenes seemed to last longer than jumbled scenes.

A second way in which attention is of relevance is in terms of the effects of attention on

the quality and/or quantity of higher-level information in the resulting scene representation.

Normal and jumbled scenes may differ in regard to the quality and/or quantity of scene

information contained in a scene’s “workable simulation” (Cavanagh, 2011) as a result of

attentional operations. As suggested by past work comparing normal and jumbled scenes,

attentional operations probably set up more information-rich representations for normal scenes

than for jumbled scenes (e.g. Biederman, 1972; Biederman et al., 1973; Biederman et al., 1974;

Foulsham et al., 2011; Sanocki et al., 2006; Varakin & Levin, 2008), and as discussed earlier, at

least some of this information might be represented by a system that represents magnitudes,

including duration (Walsh, 2003). Whether a general magnitude system is the sole driver of

these effects cannot be determined based on the current results alone. Future research will be

needed to further clarify which specific aspects of higher-level scene processing influence

subjective duration.

Time perception without a dedicated timer

We have argued that the difference between normal and jumbled scenes in terms of

perceived duration cannot be easily accounted for by models that include a dedicated timekeeper

or pacemaker. Of course, we are not the first to propose that there is no internal pacemaker

dedicated to timekeeping (see Ivry & Schlerf, 2008). However, past models do not make clear

predictions about the relative subjective durations of normal and jumbled scenes that are

presented for .4 to 1.6 seconds (the range of duration used here). Some models propose “timer

free” mechanisms for tracking durations in the tens to hundreds of milliseconds range, but leave

open the possibility that some kind of dedicated pacemaker might be used to track longer

intervals (e.g. Karmarkar & Buonomano, 2007).

Other models are not well specified enough to make clear predictions regarding the

relative subjective durations of normal and jumbled scenes. One such model is the

change/segmentation model (Poynter, 1989; Poynter & Homa, 1983), which proposes that

perceived duration is a function of the number of stimuli encountered during an interval, the

organization of these stimuli, and actual clock duration. However, it is not clear how these

concepts would apply in the current experiment. For example, in past research, sensory events

have been operationalized in terms of variations in sensory information over time (e.g. number of

flashing lights, Poynter & Homa, 1983), and it is not clear how (or even if) normal and jumbled

scenes differ in terms of the number of sensory events. The change/segmentation model also

proposes that not all sensory events are equally good at marking time’s passage. In general,

disorganized events that are difficult to remember will not be as good for keeping track of time

as well-organized events. Normal scenes, having organization that is better suited for supporting

tasks that require memory (such as change detection, Varakin & Levin, 2008) would be better

markers for time’s passage than jumbled scenes, according to the change/segmentation model.

While the quality of temporal markers is clearly relevant when several stimuli are presented

sequentially over the course of a trial (Poynter, 1983; Poynter & Homa, 1983) or when dynamic

events are used as stimuli (Boltz, 1995; 2005; Brown & Boltz, 2002; Liverence & Scholl, 2012),

it is not clear what role this concept could play in the current context, where the stimuli

were static.

Conclusion

In conclusion, past research using realistic scenes largely ignored how scene structure

affects time perception, and past research on time perception tended to avoid using realistic

scenes. The current experiments demonstrate that when scene structure is easy to perceive or

when participants pay attention to scene structure, the subjective duration of scenes with normal

structure is greater than the subjective duration of scenes with jumbled structure. These results

follow from the ideas that time perception correlates positively with the rate of visual

information processing (Tse et al., 2004) and perception of magnitudes (Walsh, 2003), and can

be explained by a model in which time is perceived in terms of the quality of visual-cognitive

information processing that occurred over an interval.

References

Appelle, S. (1972). Perception and discrimination as a function of stimulus orientation: The

“Oblique Effect” in man and animals. Psychological Bulletin, 78(4), 266-278.

Biederman, I. (1972). Perceiving real-world scenes. Science, 177, 77-80.

Biederman, I., Glass, A. L., and Stacy, E. W. (1973). Searching for objects in real-world scenes.

Journal of Experimental Psychology, 97, 22-27.

Biederman, I., Rabinowitz, J. C., Glass, A. L., and Stacy, E. W. (1974). On the information

extracted from a glance at a scene. Journal of Experimental Psychology, 103, 597-600.

Boltz, M.G. (1995). Effects of event structure on retrospective duration judgments. Perception

and Psychophysics, 57(7), 1080-1096.

Boltz, M.G. (2005). Duration judgments of naturalistic events in the auditory and visual

modalities. Perception and Psychophysics, 67(8), 1362-1375.

Brown, S.W. (1985). Time perception and attention: The effects of prospective versus

retrospective paradigms and task demands on perceived duration. Perception and

Psychophysics, 38, 115-124.

Brown, S.W. (1997). Attentional resources in timing: Interference effects in concurrent temporal

and nontemporal working memory tasks. Perception and Psychophysics, 59, 1118-1140.

Brown, S. W. (2008). Time and attention: Review of the literature. In S. Grondin (Ed.),

Psychology of Time (pp. 111-138). Bingley, UK: Emerald.

Brown, S. W., and Boltz, M. G. (2002). Attentional processes in time perception: Effects of

mental workload and event structure. Journal of Experimental Psychology: Human

Perception and Performance, 28, 600-615.

Buhusi, C.V. and Meck, W.H. (2005). What makes us tick? Functional and neural mechanisms

of interval timing. Nature Reviews Neuroscience, 6, 755-765.

Casasanto, D., and Boroditsky, L. (2008). Time in the mind: Using space to think about time.

Cognition, 106, 579-593.

Cavanagh, P. (2011). Visual cognition. Vision Research, 51(13), 1538-1551.

Cohen, J.D., MacWhinney, B., Flatt, M., and Provost, J. (1993). PsyScope: A new graphic

interactive environment for designing psychology experiments. Behavior Research

Methods, Instruments, and Computers, 25, 257-271.

Droit-Volet, S., and Meck, W. (2007). How emotions colour our perception of time. Trends in

Cognitive Sciences, 11, 499-544.

Foulsham, T., Alan, R., and Kingstone, A. (2011). Scrambled eyes? Disrupting scene structure

impedes focal processing and increases bottom-up guidance. Attention, Perception and

Psychophysics, 73, 2008-2025.

Fraisse, P. (1984). Perception and estimation of time. Annual Review of Psychology, 35, 1-36.

Franconeri, S. L., Hollingworth, A., and Simons, D. J. (2005). Do new objects capture attention?

Psychological Science, 16(4), 275-281.

Hansen, B.C., and Essock, E.A. (2006). Anisotropic local contrast normalization: The role of

stimulus orientation and spatial frequency bandwidths in the oblique and horizontal

effects. Vision Research, 46, 4398-4415.

Henderson, J.M., and Hollingworth, A. (1999). High-level scene perception. Annual Review of

Psychology, 50, 243-271.

Hicks, R.E., Miller, G.W., and Kinsbourne, M. (1976). Prospective and retrospective judgments

of time as a function of amount of information processed. American Journal of

Psychology, 89(4), 719-730.

Ivry, R.B., and Schlerf, J.E. (2008). Dedicated and intrinsic models of time perception. Trends in

Cognitive Sciences, 12, 273-280.

Jones, B., and Huang, Y.L. (1982). Space-time dependencies in psychophysical judgment of

extent and duration: Algebraic models of the tau and kappa effects. Psychological

Bulletin, 91, 128-142.

Lebensfield, P., and Wapner, S. (1968). Configuration and space-time interdependence. The

American Journal of Psychology, 81, 106-110.

Liverence, B.M., and Scholl, B.J. (2012). Discrete events as units of perceived time. Journal of

Experimental Psychology: Human Perception and Performance, 38(3), 549-554.

New, J.J., and Scholl, B. J. (2009). Subjective time dilation: Spatially local, object-based, or a

global visual experience. Journal of Vision, 9, 1-11.

Ono, F., and Kawahara, J.-I. (2007). The subjective size of visual stimuli affects perceived

duration of their presentation. Perception & Psychophysics, 69, 952-957.

Pariyadath, V., and Eagleman, D. (2007). The effects of predictability on subjective duration.

PLoS One, 11, e1264.

Potter, M.C. (1975). Meaning in visual search. Science, 187, 965-966.

Potter, M.C. (1976). Short-term conceptual memory for pictures. Journal of Experimental

Psychology: Human Learning and Memory, 2, 509-522.

Potter, M.C., and Faulconer, B.A. (1975). Time to understand pictures and words. Nature, 253,

437-438.

Poynter, W.D. (1983). Duration judgment and the segmentation of experience. Memory and

Cognition, 11(1), 77-82.

Poynter, W.D. (1989). Judging the duration of time intervals: a process of remembering

segments of experience. In I. Levin, and D. Zakay (Eds.), Time and Human Cognition: A

Life-span perspective (pp. 305-331). Amsterdam, Netherlands: Elsevier.

Poynter, W.D., and Homa, D. (1983). Duration judgment and the experience of change.

Perception and Psychophysics, 33(6), 548-560.

Reingold, E. M., and Merikle, P. M. (1988). Using direct and indirect measures to study

perception without awareness. Perception & Psychophysics, 44, 563-575.

Sanocki, T., Michelet, K., Sellers, E., and Reynolds, J. (2006). Representation of scene layout

can consist of independent, functional pieces. Perception and Psychophysics, 68, 415-427.

Thomas, E.A.C., and Weaver, W.B. (1975). Cognitive processing and time perception.

Perception and Psychophysics, 17, 363-367.

Tse, P. U., Intriligator, J., Rivest, J., and Cavanagh, P. (2004). Attention and the subjective

expansion of time. Perception & Psychophysics, 66, 1171-1189.

Varakin, D. A., and Levin, D. T. (2008). Scene structure enhances change detection. The

Quarterly Journal of Experimental Psychology, 61, 543-551.

Walsh, V. (2003). A theory of magnitude: Common cortical metrics of time, space and quantity.

Trends in Cognitive Sciences, 7, 483-488.

Xuan, B., Zhang, D., He, S., and Chen, X. (2007). Larger stimuli are judged to last longer.

Journal of Vision, 7, 1-5.

Yokosawa, K. and Mitsumatsu, H. (2003). Does disruption of a scene impair change detection?

Journal of Vision, 3, 41-48.

Zakay, D., and Block, R. A. (1995). An attentional-gate model of prospective time estimation. In

M. Richelle, V. De Keyser, G. d'Ydewalle, & A. Vandierendonck (Eds.), Time and the

dynamic control of behavior (pp. 167-178). Liège, Belgium: Université de Liège.

Zhao, J., Al-Aidroos, N., and Turk-Browne, N. B. (in press). Attention is spontaneously biased

towards regularities. Psychological Science.

Footnotes

1 See Varakin and Levin (2008) for details on how the scenes were created.

2 Mean response times were calculated after eliminating error trials (6% of all trials) and outliers

(trials with a response time that fell more than 3.5 SDs away from an individual participant’s cell

mean, 1.3% of all trials).
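
The trimming rule in Footnote 2 can be sketched as follows. This is an illustrative reading of the footnote, not the authors' script: it assumes that error trials are removed before the cell mean and SD are computed, that the sample SD is used, and that each trial is stored as a (response time in ms, correct?) pair. The function name and the example trials are hypothetical.

```python
import statistics


def trim_cell(trials, criterion=3.5):
    """Return (trimmed mean RT, kept RTs) for one participant in one cell.

    trials: list of (rt_ms, correct) pairs.
    Assumption: errors are dropped first, then RTs more than `criterion`
    sample SDs from the cell mean of the correct trials are dropped.
    """
    correct_rts = [rt for rt, correct in trials if correct]
    cell_mean = statistics.mean(correct_rts)
    cell_sd = statistics.stdev(correct_rts)
    kept = [rt for rt in correct_rts if abs(rt - cell_mean) <= criterion * cell_sd]
    return statistics.mean(kept), kept


# Hypothetical cell: 20 typical correct trials, one very slow correct trial,
# and one error trial. These are NOT data from the experiment.
example_trials = [(600.0, True)] * 20 + [(3000.0, True), (450.0, False)]
trimmed_mean, kept_rts = trim_cell(example_trials)
```

Note that with a 3.5 SD criterion computed from the same trials, a single extreme value can only be excluded when the cell contains enough trials (roughly 15 or more), which the 49 trials per cell in this design satisfy.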

Figures

Figure 1: Stimulus examples from Experiment 1. A) A normal scene. B) A jumbled scene.

Figure 2: Experiment 1 results. Data points are averages based on the actual proportion of long

responses made by participants (solid circles = normal scenes; open squares = jumbled scenes).

Lines are averages of predictions of the logit models that were fit individually to each

participant’s data (solid line = normal scenes, dashed line = jumbled scene).
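
As a rough illustration of what fitting a logit model to one participant's proportion of "long" responses involves, the sketch below fits p(long) = 1 / (1 + exp(-(b0 + b1 * duration))) by Newton's method. It is not the authors' fitting procedure, which is not described in this section. The endpoints of the duration range (.4 and 1.6 s) come from the text; the intermediate durations, the generating parameters, and the function name are assumptions made for the example, and the proportions are generated from a known curve rather than taken from any participant.

```python
import math


def fit_logit(durations, prop_long, n_iter=50):
    """Fit a two-parameter logistic function by Newton's method.

    Returns (b0, b1) maximizing the binomial log-likelihood, treating each
    proportion as the outcome at that duration.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(durations, prop_long):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p          # gradient with respect to b0
            g1 += (y - p) * x    # gradient with respect to b1
            h00 += w             # entries of the (negated) Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical durations (s) spanning the .4 to 1.6 s range used here.
durations = [0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
# Noise-free proportions generated from an assumed curve (b0 = -5, b1 = 5).
prop_long = [1.0 / (1.0 + math.exp(-(-5.0 + 5.0 * d))) for d in durations]

b0, b1 = fit_logit(durations, prop_long)
bisection_point = -b0 / b1  # duration at which p(long) = .5
```

A leftward shift of the fitted curve for one scene type (a smaller bisection point) corresponds to that scene type being judged "long" at shorter clock durations.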

Figure 3: Stimulus examples from Experiments 2 and 3. A) A normal scene. B) A jumbled

scene.

Figure 4: Experiment 2 results. Data points are averages based on the actual proportion of long

responses made by participants (solid circles = normal scenes; open squares = jumbled scenes).

Lines are averages of predictions of the logit models that were fit individually to each

participant’s data (solid line = normal scenes, dashed line = jumbled scene).

Figure 5: Experiment 3 results. Data points are averages based on the actual proportion of long

responses made by participants (solid circles = normal scenes; open squares = jumbled scenes).

Lines are averages of predictions of the logit models that were fit individually to each

participant’s data (solid line = normal scenes, dashed line = jumbled scene).