Parallel and serial grouping of image elements in visual perception

Journal of Experimental Psychology: Human Perception and Performance Parallel and Serial Grouping of Image Elements in Visual Perception Roos Houtkamp, and Pieter R. Roelfsema Online First Publication, August 23, 2010. doi: 10.1037/a0020248 CITATION Houtkamp, R., & Roelfsema, P. R. (2010, August 23). Parallel and Serial Grouping of Image Elements in Visual Perception. Journal of Experimental Psychology: Human Perception and Performance. Advance online publication. doi: 10.1037/a0020248

Parallel and Serial Grouping of Image Elements in Visual Perception

Roos Houtkamp, Otto von Guericke Universitaet

Pieter R. Roelfsema, VU University

The visual system groups image elements that belong to an object and segregates them from other objects and the background. Important cues for this grouping process are the Gestalt criteria, and most theories propose that these are applied in parallel across the visual scene. Here, we find that Gestalt grouping can indeed occur in parallel in some situations, but we demonstrate that there are also situations where Gestalt grouping becomes serial. We observe substantial time delays when image elements have to be grouped indirectly through a chain of local groupings. We call this chaining process incremental grouping and demonstrate that it can occur for only a single object at a time. We suggest that incremental grouping requires the gradual spread of object-based attention so that eventually all the object’s parts become grouped explicitly by an attentional labeling process. Our findings inspire a new incremental grouping theory that relates the parallel, local grouping process to feedforward processing and the serial, incremental grouping process to recurrent processing in the visual cortex.

Keywords: Gestalt grouping, perceptual organization, parallel and serial processing, visual attention, contour integration

The image that enters our eyes is broken down into many fragments by neurons in the retina and the next few processing stages of the visual system, which have small receptive fields covering only a small part of the visual world. The neurons at these early processing levels are sensitive to elementary features such as orientation, color, and direction of motion. Thus, the initial representation of the visual world is a fragmented one where the various features of the same object are represented separately by different neurons that are distributed across a number of areas in the visual cortex. There exists a large explanatory gap between this fragmented representation and our perception given that we live in a world that is filled with coherent and spatially extended objects. The implication is that there must be efficient and flexible processes at work that group the image elements that are part of the same object and segregate them from elements that belong to different objects and the background (Treisman, 1996; Von der Malsburg, 1999). The grouping of image elements at different spatial locations occupied by the same spatially extended object demands a complex and context-sensitive process (Roelfsema, 2006; Shadlen & Movshon, 1999), and yet, we as observers are very adept at judging which features belong together and which features do not.

The Gestalt psychologists were the first to describe many of the cues that are used by the visual system for grouping and segmentation (Koffka, 1935; Wertheimer, 1923). One example of a Gestalt rule is that of similarity, stating that similar elements in a visual scene tend to be grouped in perception. Other Gestalt laws are that of proximity, implying that nearby elements are grouped; the law of connectedness, suggesting that connected elements are grouped; the law of common fate, stating that elements are grouped if they move in the same direction; and the law of good continuation, stating that well-aligned contours are assigned to the same perceptual object (e.g., Koffka, 1935; Kubovy, Holcombe, & Wagemans, 1998; Rock & Palmer, 1990; Wertheimer, 1923). There is no consensus, however, about the precise implementation of these Gestalt grouping rules, and the research field has been divided by a number of distinct and incompatible theories.

There are three major theories about the implementation of the perceptual grouping operations in the visual brain. The first theory holds that perceptual grouping is implemented very efficiently so that the analysis of a stimulus can be completed rapidly after its appearance (Ghose & Maunsell, 1999; Riesenhuber & Poggio, 1999; Tovee, 1994). This view holds that most perceptual grouping operations are carried out in parallel when the visual information is propagated from lower to higher areas of the visual cortex during feedforward processing. Neurons in lower areas respond to the individual image elements, and neurons in higher areas (sometimes called cardinal cells) are selective for more complex feature constellations (Barlow, 1972; see Figure 1A). The activity of a neuron that is tuned to a red vertical bar, for example, codes the presence of this feature conjunction in the display, and similarly, neurons that are tuned to a particular shape code for a set of lower level features in a specific arrangement. These hardwired detectors of feature conjunctions have been called base groupings (Roelfsema, 2006).

Roos Houtkamp, Department of Cognitive Biology, Otto von Guericke Universitaet; Pieter R. Roelfsema, Department of Experimental Neurophysiology, Center for Neurogenomics and Cognitive Research, VU University.

We thank Talia Bets for her help in data collection. This work was supported by a grant of the McDonnell Pew Program in Cognitive Neuroscience, a Human Frontier Science Program Young Investigators grant, a grant from the European Union (EU IST Cognitive Systems, Project 027198 “Decisions in Motion”), and a Nederlandse Organisatie voor Wetenschappelijk Onderzoek VICI grant.

Correspondence concerning this article should be addressed to Roos Houtkamp, Department of Cognitive Biology, Otto von Guericke Universitaet, Leipzigerstrasse 44, Haus 91, 39120 Magdeburg, Germany. E-mail: [email protected]

Journal of Experimental Psychology: Human Perception and Performance. © 2010 American Psychological Association. 0096-1523/10/$12.00. DOI: 10.1037/a0020248

A limitation of the coding of perceptual groups by dedicated neurons is that there are more shapes than can be coded in this manner (Von der Malsburg, 1999). This is particularly evident for new shapes that have never been seen before because it is unlikely that the relevant shape-selective cell would be available. To code groupings in these situations, it would be convenient if there were a way to identify the neurons that code the lower level features of the object and to set them aside from neurons coding features of different objects. The advantage of such a labeling process is that it adds flexibility because the label could also be distributed across the low-level representations of image elements of a new object that was never seen before. Along these lines of reasoning, a second theory proposed the idea of binding-by-synchrony. This theory holds that image elements that are grouped together in perception are labeled in the visual cortex by neuronal synchrony (Engel, König, Kreiter, Schillen, & Singer, 1992; Singer & Gray, 1995). Neurons coding features of the same perceptual object would fire their action potentials at approximately the same time, whereas neurons coding the features of different objects would fire independently (see Figure 1B). Thus, the precise timing of neuronal responses is hypothesized to act as a “label” that binds all the image elements that belong together. Modeling studies have demonstrated that the appearance of synchrony between neurons does not occur instantaneously, but that the synchronicity has to spread gradually from one image element to the next. Thus, binding-by-synchrony could explain why in some tasks the time required for grouping increases with the number of image elements that have to be grouped in perception (Behrmann, Zemel, & Mozer, 1998; Sporns, Gally, Reeke, & Edelman, 1989).

The third theory holds that features to be bound in perception are labeled by attention. Such a role for attention in binding was first proposed by the feature integration theory (FIT) of Treisman and Gelade (1980), which suggested that focal attention is required to group different features, such as a color and a shape present at the same spatial location. Notably, the FIT suggested at the same time that Gestalt cues that establish groupings between features at different spatial locations are evaluated preattentively and in parallel, and this view has been adopted by other theories as well (e.g., Bergen & Julesz, 1983; Julesz, 1981; Neisser, 1967; Treisman, 1982; Treisman & Gelade, 1980; Treisman & Gormican, 1988; Wolfe & Bennett, 1997). However, there are also studies that have provided evidence to the contrary by demonstrating that attention is required for the evaluation of Gestalt grouping cues, at least in some situations (Houtkamp, Spekreijse, & Roelfsema, 2003; Roelfsema, 2006).

The strongest case for an attention-demanding Gestalt grouping process is presumably given by contour grouping (curve-tracing) tasks where observers have to indicate whether two contour elements are part of the same elongated curve or part of different curves. The response time in this task increases approximately linearly with the number of contour elements that have to be grouped together, as if a process is at work that groups the contours incrementally, one at a time (Jolicoeur, Ullman, & MacKay, 1986, 1991). Furthermore, an earlier study (Houtkamp et al., 2003) demonstrated that attention gradually spreads from one contour element to the next until an entire curve becomes labeled by object-based attention. In this earlier study, we used a dual-task design in which the primary task was to locate the end of a target curve that started at a central fixation point, while ignoring a second, distracting curve (see Figure 7A for comparable stimuli). To probe the distribution of attention at different stages of the curve-tracing process, we presented colors on different segments of the target and distractor curve at various time intervals during the trial; the secondary task was for the observer to report one of these colors. Early in the trials, subjects were accurate in reporting colors of the target curve close to central fixation. Later in the trial, however, colors were reported reliably for all segments of the target curve. This result implies that attention spreads gradually across the entire target curve.

Figure 1. Three models for the neurophysiology of perceptual grouping. (A) Cardinal cells in higher areas that are tuned to constellations of lower level features. (B) Patterns of synchronous discharges are used to label image elements of the same perceptual group. The patterns of synchrony are thought to be propagated gradually across an object representation. (C) A perceptual group is labeled with an enhanced neuronal response. The spread of the enhanced firing rate is a gradual, time-consuming process.


These findings inspire what we refer to as an incremental grouping theory (IGT) where perceptual grouping takes place by the spread of object-based attention along contour elements that are colinear and connected to each other. In this view, the spread of attention makes the groupings explicit (i.e., accessible for report). In neurophysiological experiments, we have demonstrated that the gradual spread of attention is implemented in the visual cortex as the spread of an enhanced neuronal firing rate through recurrent corticocortical connections between neurons coding contour elements that are in each other’s good continuation (Roelfsema, Lamme, & Spekreijse, 1998). The enhanced neuronal activity could spread through horizontal connections between neurons in the same cortical area and through feedback connections from higher to lower areas (reviewed by Roelfsema, 2006; see Figure 1C). The dynamics of such a spread of enhanced activity across the representation of a spatially extended object have been simulated in neural network modeling studies (Grossberg & Raizada, 2000; Sha’ashua & Ullman, 1988). Thus, the IGT agrees with the binding-by-synchrony theory by suggesting that image elements to be grouped in perception have to be labeled in the visual cortex, but now the label is enhanced neuronal activity instead of the synchronicity of neuronal discharges.

Here, we set out to test the critical differences between these three theories. Our first experiments (Experiments 1 and 2) investigated whether Gestalt grouping occurs rapidly and in parallel or whether it involves a serial, time-consuming process, and we delineate the critical differences between tasks that permit parallel Gestalt grouping and conditions where Gestalt grouping requires serial processing. We establish conditions where grouping by good continuation, common fate, proximity, and similarity is associated with serial processing with delays of up to hundreds of milliseconds. This demonstration implies that Gestalt grouping is not always carried out by a preattentive, unlimited capacity mechanism. Our third experiment tried to distinguish between the binding-by-synchrony and binding-by-enhanced-activity theories. Although these are neurophysiological theories that are usually addressed with neurophysiological techniques (see, e.g., Palanca & DeAngelis, 2005; Roelfsema, Lamme, & Spekreijse, 2004; Thiele & Stoner, 2003), there is a critical difference between them that we tested behaviorally. The binding-by-synchrony theory holds that multiple incremental groups can be established in parallel (Engel et al., 1992; Singer & Gray, 1995) because every object representation can be labeled with its own unique synchronous pattern of activity. Theories based on an enhanced firing rate, like the IGT, rather suggest that there is only one label: Responses are either enhanced or they are not, so that the labeling with an enhanced response can occur for only one object at a time. Experiment 3 shows that the serial, incremental perceptual grouping process happens for only one object at a time, and it thereby provides evidence in support of the IGT and against binding-by-synchrony.

Experiment 1. Conditions for Serial and Parallel Grouping

In our Experiments 1A and 1B, we examined whether perceptual grouping on the basis of good continuation and common fate occurs in parallel or serially. We tried to isolate the conditions that permit the evaluation of these grouping cues in parallel and the conditions where a serial process is required.

There is a controversy in the literature regarding the efficiency of grouping by good continuation, with some studies demonstrating parallel processing and others showing the requirement of a serial grouping mechanism. The so-called pathfinder studies provided evidence for a parallel process that is sensitive to the good continuation of contour elements. In these studies, observers are asked to detect the presence of a string of Gabor elements that are aligned colinearly to form a curved path on a background of randomly oriented elements (see Figure 2B; Field, Hayes, & Hess, 1993; Kovacs & Julesz, 1993). The percept of the path has to derive from a process that integrates the relative orientation of the elements along the path and cannot be obtained by scrutinizing one element at a time. Nevertheless, the path appears to pop out (Field et al., 1993), which suggests that grouping on the basis of good continuation occurs in parallel across the visual field.

However, we have mentioned a number of other studies in the introduction that demonstrated that the grouping of image elements on the basis of connectedness and good continuation is associated with substantial delays (Jolicoeur et al., 1986, 1991; Pringle & Egeth, 1988; Roelfsema, Scholte, & Spekreijse, 1999). These delays occur, for example, in a curve-tracing task where the stimuli consist of multiple curves and participants have to judge whether contour elements belong to the same or to different curves (see Figure 2E). Processing time in this task increases linearly with the length of the curves, which implies that good continuation is not invariably evaluated by an unlimited capacity mechanism. We showed that attention gradually spreads over the curve that is traced from attended contour segments to other segments that are colinear and connected to them, until the entire target curve has been labeled with attention (Houtkamp et al., 2003; see also Scholte, Spekreijse, & Roelfsema, 2001). Our Experiment 1A aimed to elucidate the source of these conflicting results: We explored when perceptual grouping of image elements can occur in parallel and when it requires a serial process. We tried to isolate the critical difference between curve-tracing studies that are associated with a serial grouping process and pathfinder studies where a parallel process suffices.

Experiment 1A. Serial and Parallel Grouping by Good Continuation

Stimuli of the first experiment consisted of either a single target curve on a background or a target curve together with a distractor curve (see Figure 2A). The target curve was always connected to the fixation point, and the subjects’ task was to indicate which of two colored circles fell on this curve. The two circles could appear at one of three positions: on the target curve, on the distractor curve that was not connected to the fixation point, or on the background. When a parallel process is involved in contour grouping, the time required to identify the color of the circle on the target curve should not depend on the circle’s position, but if response times are slower for positions farther along the curve, then we can conclude that a serial process is involved.

We considered three factors that differ between the previous pathfinder and curve-tracing studies, and that therefore could be responsible for the apparently discrepant findings. First, the pathfinder studies use a background of randomly oriented Gabor patches that could influence the integration of contour segments, but these elements have until now not been presented in curve-tracing studies (compare Figures 2C and 2D). Second, the pathfinder studies use oriented Gabor patches that might be treated differently from the continuous curves that have been used in curve-tracing studies (see Figure 2D vs. 2E). Third, the pathfinder studies require the detection of a single colinear path in a background of randomly oriented elements (see Figure 2B), whereas the curve-tracing task requires the segregation of a curve from one or more curves with the same degree of colinearity (see Figure 2D). Our first experiment examined which of these three factors determines whether contour grouping occurs in parallel or serially.

Figure 2. (A) A schematic representation of the stimuli. The stimuli consisted of a target curve connected to the fixation point, and (in some cases) a distractor curve. One red and one green colored circle were presented. One of these fell on the first (1), middle (2), or last (3) part of the target curve, and the other circle fell on the corresponding location on the distractor curve (if present) or on the background (if there was no distractor curve). The task of the subject was to indicate the color of the circle on the target curve. (B–E) The four stimulus conditions. For clarity, the contrast of the Gabor elements on the curves has been enhanced; in the actual task, they had the same luminance as the background elements. (B) In the OneBackground condition, the stimulus consisted of the target curve and background elements, but there was no distractor curve. (C) The stimulus of the TwoBackground condition contained a target and a distractor curve, as well as background elements. (D) In the TwoAlone condition, the two curves were composed of Gabor elements; (E) in the TwoContinuous condition, they were continuous.

Method.

Participants. Seven subjects participated in Experiment 1A (ages 20–40 years, five women). All reported normal or corrected-to-normal visual acuity. One of the subjects was an author; the others were healthy volunteers, naive about the purpose of the experiment. They were paid €10.50 for their participation in a single 90-min session including short breaks after every block.

Stimuli and task. The stimulus monitor was located 78 cm in front of the subject, and the diagonal of the display subtended 33°. Figure 2A shows the general design of the stimuli. The subjects saw one or two curves and two circles: one of these circles was red and the other one was green. The target curve was connected to the fixation point, and the distractor curve, if present, was not. It was the subject’s task to report the color of the circle on the target curve. We transformed a pathfinder stimulus into a curve-tracing stimulus in three steps. Our OneBackground condition (see Figure 2B) consisted of a target curve that was defined by colinearly oriented Gabor patches on a background with randomly oriented elements. In the TwoBackground condition (see Figure 2C), we added a distractor path of Gabor elements; in the TwoAlone condition (see Figure 2D), we removed the background elements; and in the TwoContinuous condition (see Figure 2E), the Gabor elements were replaced by a continuous curve.

The red and green circles (1.4° in diameter) appeared on the first, middle, or last part of the curves, always at the same eccentricity (6.5°). One of the circles was presented on one of the three possible target locations and the other circle was presented at the corresponding location on the distractor curve or background (i.e., both at Location 1, 2, or 3 in Figure 2A). Gabor patches were placed on the curves with an orientation parallel to the curve. Sixty patches were placed on each curve, with a spatial jitter of 6 pixels, drawn from a uniform distribution. In the conditions with a background, randomly oriented patches were placed on a grid with the same density and spatial jitter as the curve elements. We used the stimuli of Figures 2B–2E (note that in the figures the contrast of the curve elements is enhanced for clarity; in the actual task, they had a contrast of 70% that was identical to the contrast of the background elements) and versions that were rotated by multiples of 60° as well as mirrored versions, yielding a total of 12 stimulus types. The four conditions were tested in a blocked design with the order of conditions counterbalanced across subjects.
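As a quick check on the stimulus count, the sketch below enumerates the variants under one plausible reading of the text: six rotations (multiples of 60°) crossed with a mirrored and an unmirrored version give 12 stimulus types per condition. The names `ROTATIONS_DEG`, `MIRRORED`, and `stimulus_types` are illustrative and not taken from the original study materials.

```python
from itertools import product

# Assumption: "rotated by multiples of 60°" means 0°, 60°, ..., 300°.
ROTATIONS_DEG = range(0, 360, 60)
# Assumption: each rotation exists in an unmirrored and a mirrored version.
MIRRORED = (False, True)

# Every (rotation, mirrored) pair is one stimulus type.
stimulus_types = list(product(ROTATIONS_DEG, MIRRORED))

print(len(stimulus_types))  # 12
```

Under these assumptions the count agrees with the 12 stimulus types stated in the text.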

Procedure. A trial started with the presentation of a central fixation point for 300 ms. Subjects were encouraged not to move their eyes by telling them that the task could be accomplished best when maintaining fixation. After the fixation period, the stimulus appeared (together with the red and green circles), and it stayed on the screen until the subject responded or 5 s had passed. The subject’s task was to decide as fast and accurately as possible whether the circle on the target curve was green or red by pressing a button with their left or right thumb, respectively. When they made an error, subjects heard a beep. Stimuli to which an incorrect response was made were repeated on a later trial.

Results. The response times and error rates for the four conditions as a function of target position are presented in Figure 3A. There was no speed–accuracy trade-off as error rates increased with increasing response times. Mean response times on correct trials were analyzed with an analysis of variance (ANOVA) with condition and target position as factors. There was a main effect of condition on reaction time, F(3, 18) = 15.04, p < .001, partial η² = .726. Post hoc comparisons indicated that differences in reaction times between conditions were all significant (p < .02, Tukey HSD). Furthermore, there was a main effect of target position on response times, F(2, 12) = 35.35, p < .001, partial η² = .855. Post hoc comparisons indicated that differences in reaction times between the three possible target positions were all significant (p < .001, Tukey HSD).

Figure 3. (A) Mean response times as a function of target position for the four conditions in Experiment 1A. Squares indicate average reaction time. Grouping speed (in ms/degree) is shown next to the curves. Error bars indicate SEM. Bars on the x-axis show the percentage of errors. (B) Local operators can distinguish between a set of image elements with a colinear configuration (Set 1) and one with random orientations (Set 2). (C) Image elements at Locations 1 and 5 are not directly grouped, but only through a chain of other grouped image elements. The evaluation of these transitive groupings apparently requires a serial process.

The ANOVA also revealed an interaction between condition and target position, F(6, 36) = 11.73, p < .001, partial η² = .674. Numbers next to the curves in Figure 3A show the slopes of the different conditions as the difference in response times between the first and the last target position divided by the distance in degrees measured along the curve (26.5°). The slope in the OneBackground condition (1 ms/degree) was not significantly different from zero: within-condition ANOVA, F(2, 12) = 0.23, p > .7, partial η² = .037. For the other three conditions, response times increased with the distance from the fixation point (first vs. second position and second vs. third position: p < .001, Tukey HSD). The slope in the OneBackground condition was shallower than that in the other conditions (p < .001). The slope in the TwoContinuous condition did not differ significantly from that in the TwoAlone condition, but the slope in the TwoBackground condition was steeper than those in the other conditions (p < .05 for all three comparisons).
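The slope measure used in the Results of Experiment 1A can be written out as a small calculation: the response-time difference between the last and the first target position, divided by the 26.5° of distance along the curve. The sketch below is illustrative only. The function name `grouping_slope` is ours, and the response times passed to it are hypothetical placeholders, not data from the experiment.

```python
# Distance between the first and last target position along the curve,
# in degrees of visual angle (value taken from the text).
CURVE_DISTANCE_DEG = 26.5


def grouping_slope(rt_first_ms, rt_last_ms, distance_deg=CURVE_DISTANCE_DEG):
    """Return the grouping slope in ms/degree.

    A slope near zero suggests position-independent (parallel) grouping;
    a positive slope suggests that time grows with distance along the curve.
    """
    return (rt_last_ms - rt_first_ms) / distance_deg


# HYPOTHETICAL response times (ms), chosen only to illustrate the arithmetic.
flat_slope = grouping_slope(rt_first_ms=600.0, rt_last_ms=626.5)
steep_slope = grouping_slope(rt_first_ms=600.0, rt_last_ms=1130.0)

print(flat_slope)   # 1.0
print(steep_slope)  # 20.0
```

Because this measure uses only the two end positions, it summarizes the overall trend and ignores the middle position; a regression over all three positions would be an alternative, which the text does not describe.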

Discussion. The present results demonstrate that contour grouping is parallel when one curve has to be segregated from a background of randomly oriented elements (see Figure 2B), but it is serial when a target curve has to be segregated from an equally colinear distractor curve. In the three conditions with two curves, the reaction time increased with the length of the curves that had to be grouped together. The continuity of curve segments had little effect on contour grouping as the slopes in the conditions with continuous curves (see Figure 2E) and colinearly oriented Gabor patches (see Figure 2D) were similar. This is consistent with the idea that grouping on the basis of connectedness and good continuation is implemented by similar processes. The speed of grouping of Gabor elements into one of two curves was reduced by a background of randomly oriented Gabor patches (see Figure 2C), presumably because some background elements may have been grouped accidentally with the target curve.

A linear increase in response time does not, by itself, provide conclusive evidence for a serial process (Townsend, 1971, 1990; Verghese, 2001). In visual search, for example, the linear increase of response time with the number of distractors can be modeled as a noisy parallel process or as a parallel process with limited capacity. In the curve-tracing task, however, the response time increases linearly with the length of the curve that has to be traced (Jolicoeur et al., 1986, 1991). Subjects start to trace at the beginning of the curve, and we showed that their attention spreads gradually toward the end (Houtkamp et al., 2003). The data are therefore most consistent with a genuine serial process where attention gradually spreads along the curve.

The results of Experiment 1A resolve the apparent discrepancy between previous studies suggesting either parallel or serial grouping of colinear contour elements. Studies that suggested that Gestalt criteria are applied in parallel across the visual field (Field et al., 1993; Kovacs & Julesz, 1993) investigated grouping of image elements of a single object, but did not require the segregation of equally coherent objects. Other studies demonstrated that parallel grouping of nearby colinear image elements can occur during visual search (Gilchrist, Humphreys, Riddoch, & Neumann, 1997) and during the formation of Kanizsa subjective figures (Davis & Driver, 1994). In these cases, the formation of local groups by colinearity suffices, just as in our OneBackground condition. Colinearity detection can be accomplished in parallel by local operators (receptive fields) sensitive to the degree of colinearity (Gigus & Malik, 1991; see Figure 3B). These groupings can be represented by dedicated neurons that have been called base groupings (Roelfsema, 2006). Support for the required base grouping comes from neurophysiological studies showing that neurons in higher visual areas are tuned to the spatial configuration of contour elements in their receptive field (Brincat & Connor, 2006; Pasupathy & Connor, 2001).

However, in most of our conditions, these base groupings do not solve the task. If there are two curves, a computation of the local degree of colinearity does not suffice because the elements of both curves are equally colinear, and it is not clear which of them belong together. To determine which image elements belong to the same overall shape, the information about multiple local groupings has to be combined. This is illustrated in Figure 3C, where it can be seen that some parts of a curve are only related indirectly through a chain of intermediate groupings, making grouping a transitive process (transitivity of grouping means that if Item 1 groups with Item 2 and 2 with 3, then 1 also groups with 3). We propose that these additional, transitive groupings are made explicit by a time-consuming incremental grouping process. Incremental grouping is presumably accomplished in the visual cortex by a gradual spread of enhanced neuronal activity over the representation of the image elements that are grouped (see Figure 1C; reviewed by Roelfsema, 2006). Correspondingly, at a psychological level, attention spreads across the to-be-grouped items. Indeed, we showed previously that incremental grouping in the curve-tracing task is associated with a time-consuming spread of visual attention from contour segments at the start of the target curve to other segments of this curve until attention “labels” all of its contour elements (Houtkamp et al., 2003). Elements that are labeled are thereby segregated from elements that are not labeled, such that grouping and segregation work as complementary processes.
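The idea of transitive grouping through a chain of local groupings can be illustrated with a toy graph traversal. This is a deliberately simplified abstraction and not the authors' neural model: image elements are nodes, each local (base) grouping is a link, and a label spreads one link per step from a starting element. The function name `incremental_grouping`, the element numbers for the distractor, and the use of discrete steps as a stand-in for time are all assumptions made for illustration.

```python
from collections import deque


def incremental_grouping(local_links, start):
    """Spread a label from `start` along local groupings.

    local_links: iterable of (a, b) pairs, each a direct local grouping.
    Returns a dict mapping every labeled element to the step at which
    the label reached it. Elements not connected to `start` stay unlabeled.
    """
    neighbors = {}
    for a, b in local_links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    steps = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, ()):
            if nxt not in steps:
                # The label arrives one step after it reached the neighbor.
                steps[nxt] = steps[current] + 1
                queue.append(nxt)
    return steps


# Target curve: elements 1..5 linked in a chain (compare Figure 3C).
target_links = [(1, 2), (2, 3), (3, 4), (4, 5)]
# Distractor curve: hypothetical elements 6..8, equally well linked locally.
distractor_links = [(6, 7), (7, 8)]

steps = incremental_grouping(target_links + distractor_links, start=1)
print(steps[5])    # 4
print(6 in steps)  # False
```

In this sketch, Elements 1 and 5 end up in the same group only after the label has crossed four links, so the step count grows with the length of the chain, while the distractor elements are never labeled even though their local links are just as strong.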

Experiment 1B. Serial and Parallel Grouping by Common Fate

The distinction between base and incremental grouping represents a drastic deviation from one of the theories mentioned in the introduction according to which Gestalt grouping occurs preattentively (Bergen & Julesz, 1983; Julesz, 1981; Neisser, 1967; Treisman, 1982; Treisman & Gelade, 1980; Treisman & Gormican, 1988; Wolfe & Bennett, 1997), by a parallel, efficient process (Ghose & Maunsell, 1999; Riesenhuber & Poggio, 1999; Tovee, 1994). It is therefore important to investigate whether our results generalize to other, higher level grouping cues. We next investigated grouping of image elements by the Gestalt rule of common fate, which states that elements moving in the same direction tend to be grouped in perception. We used a design that was similar to that of Experiment 1A.

Method.

Participants. Eight new subjects who reported normal or corrected-to-normal visual acuity participated in the second experiment (ages 18–30 years, seven women). One subject was excluded from analyses because of an excessive error rate (40%) on the last target position in the TwoBackground condition. The other subjects performed much better than chance level for all target positions in the four conditions (error rates < 26%). One of the subjects was an author; the others were healthy volunteers, naive about the purpose of the experiment. They were paid €10.50 for their participation in a single 90-min session including short breaks after every block.

Stimuli and procedure. The general layout of the stimuli was the same as in the first experiment (see Figure 2A), but now the elements of the curve(s) were defined by common fate in three of the conditions, whereas the TwoContinuous condition used continuous curves as in Figure 2E. One hundred groups of eight dots (each 3 pixels in diameter) were placed on an imaginary curve stimulus with a spatial jitter of 2 pixels. Every group was drawn on a circle with a radius of 8 pixels (see Figure 4A). At any given frame, one of the eight dots in each group was shown. On the next frame, the dot located next to the previous one was shown. Using a refresh rate of 60 Hz, this resulted in the impression of a dot rotating with a rate of 1 cycle per 133 ms. On half of the trials, the dots on the curve(s) rotated clockwise and on the other half counterclockwise. The phases of the dots on the curve(s) were identical, and this permitted grouping of these dots by common fate. The background, if present, was composed of similar groups of dots rotating in the opposite direction with random phases. The four conditions of this experiment were analogous to the ones of the previous experiment (see Figure 2), but we gave the subjects more time to respond (10 s instead of 5 s).
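The timing of the rotating-dot stimulus follows directly from the numbers given above: eight dot positions shown one per frame at 60 Hz. The sketch below reproduces that arithmetic and computes which dot of a group is visible on a given frame. The function names, the `phase` and `direction` parameters, and the angle convention are illustrative assumptions; whether a positive direction looks clockwise depends on the display's y-axis convention.

```python
import math

REFRESH_HZ = 60      # refresh rate stated in the text
DOTS_PER_GROUP = 8   # dots per group stated in the text
RADIUS_PX = 8        # circle radius stated in the text


def rotation_period_ms(refresh_hz=REFRESH_HZ, n_dots=DOTS_PER_GROUP):
    """Duration of one full rotation when one dot is shown per frame."""
    return n_dots * 1000.0 / refresh_hz


def dot_offset(frame, phase=0, direction=1,
               n_dots=DOTS_PER_GROUP, radius=RADIUS_PX):
    """Offset (x, y) in pixels of the visible dot relative to the group center.

    `phase` is the starting index of the group (identical for all groups on
    a curve, random for background groups); `direction` is +1 or -1.
    """
    index = (phase + direction * frame) % n_dots
    angle = 2.0 * math.pi * index / n_dots
    return (radius * math.cos(angle), radius * math.sin(angle))


print(round(rotation_period_ms(), 1))  # 133.3
```

Two groups with the same `phase` and `direction` return identical offsets on every frame, which is the common-fate cue; background groups would use the opposite `direction` and a random `phase`.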

Results. Figure 4B shows the response times on correct trials as well as the error rates as a function of target position. An ANOVA with condition and target position as factors revealed a main effect of condition on response time, F(3, 18) = 15.81, p < .001, partial η² = .725. Post hoc comparisons showed that all conditions differed significantly from each other (p < .001, Tukey HSD, except OneBackground vs. TwoAlone: p < .003). There also was a main effect of target position, F(2, 12) = 36.53, p < .001, partial η² = .859, and the differences in reaction times between positions were all significant (p < .001, Tukey HSD). Furthermore, the ANOVA revealed an interaction between condition and target position, F(6, 36) = 13.50, p < .001, partial η² = .692. A within-condition ANOVA revealed that response times in the OneBackground condition differed between target positions, F(2, 12) = 5.96, p < .02, partial η² = .498. This was due to a difference between the middle and the last (p < .01, Tukey HSD) but not the first and the middle (p > .85, Tukey HSD) target position. For the other conditions, response times increased with the distance from the fixation point (p < .001, Tukey HSD for all comparisons). Planned comparisons revealed that the slope was shallower in the OneBackground condition than in the TwoAlone condition, F(2, 12) = 27.94, p < .001, partial η² = .823. The slope in the TwoContinuous condition was marginally shallower than that in the TwoAlone condition, F(2, 12) = 3.44, p < .07, partial η² = .365, which in turn was shallower than the slope in the TwoBackground condition, F(2, 12) = 9.42, p < .01, partial η² = .611.

Figure 4. (A) A schematic representation of the stimulus in Experiment 1B. The left panel shows four groups of eight dots of the target curve (T) and seven groups of the background (B). At any given video frame, one of the eight dots in each group was shown (black dots in the figure; in the actual task, these were white against a grey background with 40% contrast). On the next frame, the dot located next to the previous one was shown. On half of the trials, the dots on the curve(s) rotated clockwise and on the other half counterclockwise. The dots on the curve(s) had the same phase, causing grouping by common fate. Elements of the background rotated in the opposite direction and with a random phase. (B) Mean response times as a function of target position for the four conditions of Experiment 1B. Squares indicate average reaction time and numbers next to the curves show the grouping speed in ms/degree. Error bars indicate SEM. Bars on the x-axis show the percentage of errors.

Discussion. We reproduced the findings of Experiment 1A with the Gestalt rule of common fate. Apparently, a parallel process is capable of detecting base groupings, that is, local groups of coherently moving image elements if they are presented on an incoherent background, even though coherent motion is represented only in higher visual areas such as the middle temporal area (for a review, see Spillmann & Werner, 1996). But if the image contains multiple objects and local groupings have to be combined in a transitive manner, the time required for grouping increases approximately linearly with the number of elements that have to be grouped together. As in Experiment 1A, the speed of the incremental grouping process decreased in the presence of background elements. This can be explained if these elements form spurious groupings with elements of the target curve that require additional processing time to be discarded.
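As a cross-check on the effect sizes reported in the Results of Experiment 1B, partial eta squared can be recovered from an F ratio and its degrees of freedom with the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to three of the reported tests; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the Results of Experiment 1B.
print(round(partial_eta_squared(15.81, 3, 18), 3))  # 0.725 (main effect of condition)
print(round(partial_eta_squared(36.53, 2, 12), 3))  # 0.859 (main effect of target position)
print(round(partial_eta_squared(13.50, 6, 36), 3))  # 0.692 (interaction)
```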

Experiment 2. Grouping by Proximity and Similarity

Experiment 2 investigated grouping by proximity and similarity and the interaction between these two grouping cues to examine whether the serial grouping process observed for good continuation and common fate also generalizes to these Gestalt grouping cues. We hypothesized that base groupings are extracted in parallel and then transitively combined as a chain during incremental grouping. If this hypothesis is correct, then the speed of incremental grouping should depend on the size of the base groupings. If a number of elements of the target curve form a single large chunk (or base grouping) that can be added to an evolving incremental group in one step, then incremental grouping should proceed rapidly, whereas it should slow when the base groupings are smaller (see Mahoney & Ullman, 1988, for a related proposal). This experiment also allowed us to control for an alternative interpretation of Experiment 1, namely that the observed seriality is due to a postgrouping evaluation process rather than a serial, incremental grouping process.
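The predicted dependence of grouping speed on the size of the base groupings can be made concrete with a toy calculation. The sketch assumes equal-sized chunks and a constant time per incremental step. Both are simplifications introduced here for illustration, and the 50-ms step time is a placeholder rather than a measured value.

```python
import math


def grouping_steps(n_elements, chunk_size):
    """Number of base groupings that must be chained to cover n_elements."""
    return math.ceil(n_elements / chunk_size)


def predicted_delay_ms(n_elements, chunk_size, ms_per_step=50.0):
    """Serial grouping time if each chunk is added in one step (assumed cost)."""
    return grouping_steps(n_elements, chunk_size) * ms_per_step


# Hypothetical string of 45 elements grouped in small versus large chunks.
print(predicted_delay_ms(45, chunk_size=1))  # 2250.0
print(predicted_delay_ms(45, chunk_size=9))  # 250.0
```

Under these assumptions, larger base groupings reduce the number of serial steps and therefore the delay, which is the qualitative prediction tested in Experiment 2.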

Experiment 2A. Serial Proximity Grouping

We tested the influence of the size of base groupings on the speed of incremental grouping by exploring how color similarity interacts with grouping by proximity. The upper left panel in Figure 5A shows the basic design of the experiment. We presented two strings of dots that were defined by the local proximity relationships between the dots. Dots belong to the same string if they are close together, or if they are grouped transitively, through a chain of dots that are in each other’s proximity. In addition, we varied grouping by similarity within and between the two strings by changing the colors of the dots.

We conjectured that similarity grouping should aid proximity grouping if colors of one string are the same and different from the colors of the other string (see Figure 5A, lower left panel). In that case, we expected large base groupings and a maximal grouping speed. On the other hand, if grouping occurs in parallel and the delays in response time are caused by a postgrouping serial evaluation process, its speed should not be affected by the similarity between strings. Although an initial grouping process might be more or less efficient, the resulting groupings on which the postgrouping process works should not depend on other objects in the scene.

Furthermore, if the two strings have the same color (see Figure 5A, upper row), proximity is the only grouping cue available, and differences between the colors of nearby elements of the same string (see upper middle and upper right panels in Figure 5A) might even hamper proximity grouping. Our experiment therefore also addressed the question of whether the proximity grouping process can ignore color variations that are not helpful to solve the task.

Method.

Participants. Twenty subjects participated (ages 18–31 years, 16 women). The subjects were healthy volunteers, and reported normal or corrected-to-normal visual acuity. They were naive about the purpose of the experiment, and were paid €7 for their participation in a 1-hr session that included short breaks after every block.

Stimuli. Ninety colored circles were placed on imaginary curves (as in Figure 2A) with a spatial jitter of 2 pixels drawn from a uniform distribution. The circles had a diameter of 0.6° and colors varied from purple to yellow, traversing the full color circle.¹ Two markers (circle and star) were shown in white and had a diameter of 1.2°, and the subject’s task was to identify the shape of the marker on the target string by pressing one of two buttons. We defined three conditions that differed in the degree of similarity of the circles of a string. In the global similarity condition (see Figure 5A, first column), all elements of a string had the same color, either yellow or purple. In the semisimilarity condition (see Figure 5A, second column), the color of the circles changed gradually from one color near the fixation point (yellow or purple) to the opposite color in color space at the end of the string (purple or yellow). In the local similarity condition (see Figure 5A, third column), the circles gradually changed from one color near the fixation point (yellow or purple) to the opposite color in the middle of the string (purple or yellow), and then back to the first color at the end of the string. The factor within-string similarity was crossed with a between-strings similarity factor. In the same initial color condition, the target and the distractor string started with the same color (either purple or yellow; see Figure 5A, first row). In the different initial color condition (see Figure 5A, second row), the target string started with one color (purple or yellow) and the distractor string started with the opposite color in color space (yellow or purple).

¹ The colors were generated by varying RGB values with constant maximal saturation and value, thus traversing the color circle (Smith, 1978). It follows that they were not isoluminant but varied between 20 cd/m² for purple and 120 cd/m² for yellow on a grey background of 12 cd/m², and it is therefore possible that luminance similarity contributed to the grouping process. We were interested in similarity grouping per se (either on the basis of color or luminance), and did not attempt to assess the relative weight of color and luminance in the similarity grouping process.
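The color assignment described in the Stimuli section and its footnote (hue varied at maximal saturation and value) can be sketched with Python's standard `colorsys` module. The exact hue angles, the direction of travel around the color circle, and the number of circles per string are not given in the text, so the values below (yellow at 60°, the opposite hue 180° away, 45 circles) are placeholders, and the function names are illustrative.

```python
import colorsys

YELLOW_HUE = 60 / 360  # assumed hue for yellow, as a fraction of the color circle


def string_hues(n, start_hue, condition):
    """Hue of each of n circles along a string for one similarity condition."""
    hues = []
    for i in range(n):
        pos = i / (n - 1) if n > 1 else 0.0  # 0 at fixation, 1 at string end
        if condition == "global":    # one color for the whole string
            shift = 0.0
        elif condition == "semi":    # gradual change to the opposite hue
            shift = 0.5 * pos
        elif condition == "local":   # opposite hue in the middle, then back
            shift = 0.5 * (1.0 - abs(2.0 * pos - 1.0))
        else:
            raise ValueError("unknown condition: " + condition)
        hues.append((start_hue + shift) % 1.0)
    return hues


def hues_to_rgb(hues):
    """Convert hues to RGB triples at maximal saturation and value."""
    return [colorsys.hsv_to_rgb(h, 1.0, 1.0) for h in hues]


semi_rgb = hues_to_rgb(string_hues(45, YELLOW_HUE, "semi"))
```

A distractor string in the different initial color condition would use the same function with `start_hue` shifted by 0.5.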

Procedure. The procedure was similar to the previous experiments with only a few differences. The three within-string similarity conditions were tested in separate blocks of 144 trials, and stimuli of the same and different initial-color conditions were presented in a randomly interleaved fashion within these blocks. Every subject started with a practice block of 24 trials that was followed by the three similarity conditions. The order of the conditions was counterbalanced across the subjects.

Results. Figure 5B shows the average response times and error rates as a function of the position of the target marker for the three conditions. Mean response times on correct trials were analyzed with a three-way ANOVA with marker position, within-string similarity, and same versus different initial color as factors. Most important, the position of the marker caused a main effect on reaction time, F(2, 38) = 104.0, p < .001, partial η² = .845. The reaction times increased approximately linearly with the distance between the fixation point and the marker as measured along the target string, and we calculated this slope in ms/degree using a linear regression analysis (see Figure 5B). There was also a main effect of the initial color, as the response times were significantly shorter if the two strings started with a different color, F(1, 19) = 52.3, p < .001, partial η² = .732, but the factor within-string similarity did not yield a significant main effect, F(2, 38) = 1.1, p > .3, partial η² = .055.

The main prediction was that a difference in the initial color between the two strings and the similarity of colors of the same string would increase grouping speed: These factors should therefore interact with the marker position in the ANOVA. The respective two-way interactions were indeed significant. Grouping speed was higher if the colors were more homogeneous, F(4, 76) = 8.4, p < .001, partial η² = .304, and also if the two strings started with a different color, F(2, 38) = 6.8, p < .005, partial η² = .264. Moreover, a difference in the initial color of the two strings was more beneficial in the global similarity condition than in the local similarity condition, F(2, 38) = 20.1, p < .001, partial η² = .511. Finally, the three-way interaction was significant, F(4, 76) = 3.6, p < .05, partial η² = .158. If both strings started with a different color, grouping speed was highest in the global similarity condition, and it decreased in the semi- and local similarity conditions. In contrast, if the two strings started with the same color, there was hardly any effect of within-string color similarity on grouping speed.

Figure 5. (A) The stimuli for the global (first column), semi- (second column), and local (third column) similarity conditions of Experiment 2A. In the same initial color condition, the target and distractor string started with the same color (upper row). In the different initial color condition (lower row), the target string started with one color and the distractor string with the opposite color in color space. (B) Mean response times across participants in Experiment 2A as a function of marker position for the three similarity conditions, presented separately for the same (solid) and different (dotted) initial color trials. Error bars show 95% confidence intervals. Grouping speed (in ms/degree) is shown next to the curves. Error percentages are shown as bars on the x-axis separately for same (solid) and different (striped) initial color trials.
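The slope in ms/degree mentioned in the Results of Experiment 2A is the coefficient of an ordinary least-squares fit of response time on distance along the target string. A minimal sketch of that computation follows. The marker distances and response times in the example are made-up numbers chosen only to show the calculation, and the function name is illustrative.

```python
def slope_ms_per_degree(distances_deg, rts_ms):
    """Least-squares slope of response time (ms) on distance (degrees)."""
    n = len(distances_deg)
    mean_x = sum(distances_deg) / n
    mean_y = sum(rts_ms) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distances_deg, rts_ms))
    sxx = sum((x - mean_x) ** 2 for x in distances_deg)
    return sxy / sxx


# Hypothetical data: three marker positions and mean response times.
distances = [5.0, 10.0, 15.0]
response_times = [600.0, 750.0, 900.0]
print(slope_ms_per_degree(distances, response_times))  # 30.0
```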

Discussion. To our knowledge, Experiment 2A is the first to demonstrate that grouping by proximity requires a time-consuming process. Proximity relationships are presumably initially only encoded as base groupings between circles that are direct neighbors of each other. If the task is to establish larger perceptual groups, however, a chain of local groupings has to be combined in a transitive manner, and the time required by this incremental grouping operation increases approximately linearly with the number of elements. We further hypothesized that the similarity between elements would influence the size of the base groupings and thereby grouping speed. We indeed observed a clear interaction between proximity and similarity grouping if the two strings started with a different color. Grouping speed increased if the color of the circles of one of the strings was homogeneous and different from the color of the other string, suggesting that larger chunks were added to the evolving perceptual group. In contrast, grouping speed decreased when the color changed gradually within the string, a manipulation that presumably decreased the size of the base groupings.

If the two strings started with the same color, there was hardly any influence of color homogeneity on grouping speed. In this case, elements of the two strings had similar colors so that the task had to be solved primarily by proximity grouping. The similar grouping speed for the different homogeneity conditions suggests that proximity grouping of circles of the same color occurs at the same speed as proximity grouping of circles with different colors. In other words, the observers appeared to ignore the colors if they were not helpful to solve the task.

The results of Experiment 2A are consistent with an incremental grouping process that adds base groupings to a gradually evolving perceptual group with a speed that depends on the size of the base groupings. The factors that influenced response times in the experiment directly influenced the formation of perceptual groups. For example, we observed an increase in grouping speed in the global homogeneity condition if the strings started with a different color. This more efficient grouping is inconsistent with a parallel grouping process that would be followed by a serial evaluation or decision process. Such an evaluation process should not be influenced by the similarity between the strings if it can operate on a well-parsed target string.

Experiment 2B. Serial Similarity Grouping

As a further test of the generality of incremental grouping, Experiment 2B investigated potential processing delays that occur during grouping on the basis of color similarity. The design was similar to that of Experiment 2A, but we now placed additional circles with random colors in the background to prevent the grouping of circles of the target string on the basis of their proximity (see Figure 6A). We note that the circles of the target string could not be identified by relying on color similarity alone because a particular color appeared on both strings in some of the conditions and also in the background. The task therefore required an interaction between grouping by similarity and proximity: Similar elements should only be grouped if they are in each other’s proximity.

Method.

Participants. The same 20 subjects of Experiment 2A participated in this experiment in an additional 1-hr session. The data of six subjects had to be excluded from the analysis because their performance was at chance for the last target position in the local similarity condition (see third column in Figure 6A). The other participants performed above chance level for all target positions in every condition (chi-square test, p < .05 at all target positions).

Stimuli and procedure. In addition to the 90 colored circles of the two strings, we placed randomly colored circles in the background on an imaginary grid. The density of the background circles was the same as the density of circles on the target and distractor strings and the position of the background circles was slightly jittered (within a range of 16 pixels, i.e., 0.4°). The procedure was the same as in Experiment 2A.

Results. Figure 6B shows the average response times as a function of the marker position. The data were analyzed with a three-way ANOVA with marker position, color homogeneity, and same or different initial color as factors. All three factors had a significant main effect on response time. First, the response time increased with the distance between the marker and the fixation point, F(2, 26) = 87.3, p < .001, partial η² = .870, which demonstrates that similarity grouping requires serial processing. The slopes ranged from 11 to 40 ms/degree. The second main effect was a difference between similarity conditions, F(2, 26) = 24.7, p < .001, partial η² = .655, as response times were shortest in the global similarity condition, increased in the semisimilarity condition, and were highest in the local similarity condition (all comparisons: p < .001, Tukey HSD). The third main effect was an increase in response time in the same initial color condition relative to the different initial color condition, F(1, 13) = 32.7, p < .001, partial η² = .716.

In addition, all the two-way interactions were significant. Grouping speed increased when the colors of the strings were more homogeneous, F(4, 52) = 13.2, p < .001, partial η² = .504, and if the two strings started with a different color, F(2, 26) = 24.9, p < .001, partial η² = .657. There also was a significant interaction between starting color and color similarity because the beneficial effect of a difference in starting color was most pronounced in the global similarity condition and absent in the local similarity condition, F(2, 26) = 12.3, p < .001, partial η² = .487.

Discussion. Experiment 2B demonstrated unequivocally that there are conditions where grouping by similarity requires serial processing. Grouping speed was lowest in the local condition where the color of the circles changed within the string and highest in the global condition where all the elements of a string had the same color. Larger chunks of image elements with a similar color can be detected as base groupings in the global condition so that fewer of these base groupings have to be combined to arrive at a certain position within the string. Furthermore, grouping speed was highest if the two strings started with a different color. This beneficial effect of target–distractor dissimilarity was absent in the local condition, but it was particularly pronounced in the global similarity condition where the colors of the target and distractor strings were entirely different. In this situation, the target string could be distinguished from the distractor string as well as from the background elements on the basis of its unique color. Nevertheless, even in this condition, grouping required serial processing.

A comparison between Figures 5B and 6B reveals that the additional background elements of Experiment 2B caused a general increase in response time.² In Experiment 2A, the target string had to be segregated only from the distractor string, but in Experiment 2B, it had to be segregated from the background elements too. Therefore, in Experiment 2A, proximity alone could be used for grouping, whereas proximity and similarity cues had to be combined in Experiment 2B. The fact that color similarity was crucial for grouping in Experiment 2B presumably also amplified the beneficial effect of within-string color homogeneity and the effect of a difference in color of the two strings on grouping speed.

² To analyze the effect of adding a background of randomly colored elements, we compared response times in the 14 participants who performed above chance level for all target positions in every condition in Experiment 2B (with background) to their response times in Experiment 2A (without background). We carried out an ANOVA with four factors: marker position, color homogeneity, same or different initial color, and the presence of background elements. We obtained a main effect of background, F(1, 13) = 78.6, p < .001, partial η² = .858, indicating that the background elements delayed the responses. Furthermore, there was a stronger effect of within-string color homogeneity in the background condition, as reflected by a significant three-way interaction between marker position, homogeneity, and background, F(4, 52) = 6.06, p < .001, partial η² = .318. Finally, the effect of a difference in starting color of the strings was more pronounced in the presence of background elements, as indicated by a significant interaction between marker position, background, and same or different initial color, F(2, 26) = 9.67, p < .002, partial η² = .427.

Figure 6. (A) Stimuli of Experiment 2B where randomly colored circles were present in the background. The global, semi-, and local similarity conditions are presented in the first, second, and third columns, respectively. The upper row presents the same initial color condition and the lower row the different initial color condition. (B) Mean response times in Experiment 2B as a function of marker position for the global, semi-, and local similarity conditions. Solid lines show same, and dotted lines different initial color trials. Error bars represent 95% confidence intervals. The slope in ms/degree is shown next to the lines. Error rate is shown on the x-axis for same (solid) and different (striped) initial color trials.

Experiment 3. Only a Single Incremental Group at a Time

The results described so far demonstrate that perceptual grouping by good continuation, common fate, proximity, and similarity requires under some conditions a serial, time-consuming process that we call incremental grouping. Our data provide evidence against theories claiming that perceptual grouping always occurs rapidly and in parallel, although Experiments 1A and 1B showed that there are also conditions where grouping can occur by a parallel process that we call base grouping. The seriality of grouping in most of the conditions studied by us is in accordance with theories proposing that in some cases perceptual grouping requires a serial labeling operation. Two distinct neuronal labels for incremental grouping have been proposed: synchrony and an enhancement of neuronal firing rates. One of the proclaimed advantages of synchrony is that it permits the formation of multiple incremental groups at the same time (Behrmann et al., 1998; Engel et al., 1992; Singer & Gray, 1995; Sporns et al., 1989). Every group of image elements can be labeled with a unique temporal pattern of neuronal activity so that the responses of neurons within a group are synchronous, whereas neurons that respond to nongrouped image elements fire independently (see Figure 1B). The label of an enhanced neuronal firing rate, in contrast, is usually believed to permit only a single incremental group at a time. The neuronal response evoked by an image element is either enhanced or it is not (Roelfsema, 2006). Experiment 3 was designed to distinguish between one- and multiple-label theories by investigating whether incremental grouping is possible for multiple perceptual objects at the same time.

We employed a contour grouping task where subjects had to group contour elements on the basis of their colinearity and connectedness to detect a target curve with two circles on both ends among distractors with a circle on one end (see Figures 7A and 7B). To investigate whether incremental grouping of contour elements of different objects can occur in parallel, we varied the number of curves (see Figure 7C). In the single pair condition, we presented two curves to the left or right of fixation; in the double pair condition, we presented a total of four curves, two on either side of the fixation point. We showed previously that intersections between the curves slow down perceptual grouping, with every intersection increasing response time by approximately 100 ms (Houtkamp et al., 2003; Roelfsema et al., 1999). Serial and parallel models make different predictions of how the intersections between two pairs of curves interact to determine the overall response time. If, for example, a target pair is combined with a distractor pair, parallel models predict that response time should depend only on the number of intersections between the curves of the target pair. In contrast, serial models predict that intersections between curves of the distractor pair also prolong response time because subjects will start with the distractor pair on half of the trials, and on those trials they have to complete grouping for the distractor pair before they proceed to group contour elements of the target pair.
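The contrasting predictions for target-present trials can be written down as a small sketch. It models only the cost of intersections, using the approximate 100 ms per intersection cited above. The base response time, the function names, and the omission of any fixed cost for processing a pair without intersections are assumptions made here for illustration.

```python
COST_PER_INTERSECTION_MS = 100.0  # approximate value cited in the text


def parallel_present_rt(base_ms, n_target, n_distractor):
    """Parallel model: only intersections in the target pair matter."""
    return base_ms + COST_PER_INTERSECTION_MS * n_target


def serial_present_rt(base_ms, n_target, n_distractor):
    """Serial model: the distractor pair is processed first on half the trials,
    so its intersections add half their cost to the mean response time."""
    return (base_ms
            + COST_PER_INTERSECTION_MS * n_target
            + 0.5 * COST_PER_INTERSECTION_MS * n_distractor)


# Hypothetical base time of 500 ms, one target-pair intersection.
for n_distractor in (0, 1, 2):
    print(n_distractor,
          parallel_present_rt(500.0, 1, n_distractor),
          serial_present_rt(500.0, 1, n_distractor))
```

Under the parallel model the predicted time is flat across distractor-pair intersections, whereas under the serial model it rises with them.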

Method

Participants. Eight subjects (ages 21–28 years, two men) participated in Experiment 3. One was an author; the others were naive about the purpose of the experiment. They were paid €8 to participate in a 1-hr session including short breaks between blocks. All reported normal or corrected-to-normal visual acuity.

Stimuli. Examples of the stimuli are shown in Figures 7A and 7B. The stimuli consisted of a pair of curves that intersected each other zero, one, or two times. In the case of one intersection, this could be either at the upper or at the lower part of the stimulus. Two black circles were present on the curves as markers. The markers were on different curves for a distractor pair (see Figure 7B), whereas they were on the same curve for a target pair (see Figure 7A). We presented the stimuli shown in Figure 7 as well as their mirror images. A pair of curves had a height of 4.3° and a width of 1.8°. The curves themselves had a width of 8 pixels and were shown in magenta (with a luminance of 40 cd/m²) or cyan (55 cd/m²) on a grey background (90 cd/m²). On a given trial, all curves had the same color. The center of the stimuli was located 1.4° to the left or right of a small fixation cross (see Figure 7C, drawn to scale).

Procedure. There were two conditions. In the single pair condition, a single pair of curves was presented to the left or right of fixation; in the double pair condition, two pairs of curves were presented. After a practice block of 50 trials for both conditions, subjects were tested in six alternating blocks of 128 trials. The order of the blocks was counterbalanced across subjects. On a button press, a central fixation cross appeared that remained visible during the trial. After 500 ms, the curves were presented. The subject decided by button press as fast and accurately as possible whether there was a target curve with circles on both ends (target-present trials, 50%). On target-absent trials in the single and double pair conditions, the subjects saw one or two distractor pairs (the other 50% of trials) and pressed the other button. When they made an error, subjects heard a beep. They were told to fixate the central cross throughout the trial in order to achieve the fastest possible response times.

Figure 7. Stimuli of Experiment 3. (A) One of the curves of a target pair had a circle on both of its ends. Numbers below stimuli correspond to the number of intersections. (B) Curves of a distractor pair were connected to a single circle. (C, left) In the single pair condition, a pair of curves was shown to the left or right of fixation (small plus symbol). (C, right) In the double pair condition, two pairs of curves were shown, one to the left and the other to the right of fixation.

Results and Discussion

Figure 8 shows how error rates and response times depend on the number of intersections in the single pair condition for target-present (A) and target-absent trials (B). Performance decreased with the number of intersections, and the average increase in response time caused by an intersection was 120 ms, a value close to that observed in previous studies (Houtkamp et al., 2003; Roelfsema et al., 1999; Scholte et al., 2001). Figures 8C and 8D show the error rates and response times in the double pair condition. Performance did not systematically vary with the number of intersections.

For target-present trials, a multilabel model predicts that response time is independent of the number of intersections between the curves of the distractor pair, and the corresponding data points are connected with lines in Figure 8C. The data are not consistent with such a model, however, because the intersections between curves of the distractor pair increased the response time (see Figure 8C). We investigated the significance of these effects with two-way ANOVAs with the number of intersections in the distractor pair and subject as factors. Intersections of the distractor pair had a significant effect on the response time for the stimuli without target pair intersections, F(2, 19) = 5.0, p < .05, partial η² = .348; for the stimuli with one intersection of the target pair, F(2, 16) = 4.1, p < .05, partial η² = .341; as well as for the stimuli with two intersections of the target pair, F(2, 17) = 3.9, p < .05, partial η² = .317.

Figure 8. Mean response times and error rates across subjects in Experiment 3. (A, B) Average response times in target-present (A) and target-absent trials (B) of the single pair condition as a function of the number of intersections. (C) Response times on target-present trials of the double pair condition. Numbers below the abscissa indicate the number of intersections in the target (first number) and distractor pair (second number). (D) Average response times on target-absent trials of the double pair condition. Numbers below the abscissa indicate the number of intersections between the curves of the two pairs. Bars on the x-axis show the percentage of errors.

Parallel and serial models also make different predictions about how intersections between the curves of two distractor pairs interact to determine response time. A serial model predicts that all the intersections cause a comparable increase in response time because both pairs of curves have to be processed before the subject can give a target-absent response. In contrast, a parallel model holds that response time depends only on the pair of curves with most intersections because the grouping of the other pair of curves can occur in parallel and finishes earlier. We therefore tested whether the response time differed between stimuli where one pair had one intersection and the other pair none and where both pairs had a single intersection (see 0-1 and 1-1 in Figure 8D), and observed a significant difference between these two conditions, F(1, 8) = 12.0, p < .01, partial η² = .601. We next tested whether there was a difference between the conditions where one pair had two intersections and the other pair zero, one, or two intersections and observed, again, that response times differed significantly between the conditions, F(2, 15) = 4.4, p < .05, partial η² = .379. The results are not in accordance with the multilabel model that permits concurrent labeling operations, but they are consistent with the one-label model that proposes that incremental grouping occurs for only one object at a time.
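The contrast between the two accounts of target-absent trials can be made concrete with a small sketch. The Python functions below are illustrative only: the additive serial rule and the maximum-based parallel rule paraphrase the predictions stated above, the 600 ms baseline is a hypothetical value, and applying the 120 ms per-intersection cost from the single pair condition to the double pair condition is an assumption made for illustration.

```python
def serial_rt(base_ms, cost_ms, intersections):
    """One-label (serial) account: the pairs are processed one after
    another, so the cost of every intersection adds to the response time."""
    return base_ms + cost_ms * sum(intersections)


def parallel_rt(base_ms, cost_ms, intersections):
    """Multilabel (parallel) account: the pairs are processed concurrently,
    so only the pair with the most intersections determines the time."""
    return base_ms + cost_ms * max(intersections)


BASE_MS = 600  # hypothetical baseline, not a value from the experiment
COST_MS = 120  # per-intersection cost reported for the single pair condition;
               # its use for the double pair condition is an assumption

# Each tuple gives the number of intersections in the two distractor pairs.
for pairs in [(0, 1), (1, 1), (2, 0), (2, 1), (2, 2)]:
    print(pairs, serial_rt(BASE_MS, COST_MS, pairs),
          parallel_rt(BASE_MS, COST_MS, pairs))
```

Under these assumptions, the parallel rule assigns the same predicted time to the 0-1 and 1-1 stimuli, whereas the serial rule predicts a difference between them, which is the contrast examined in Figure 8D.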

General Discussion

Here, we have measured the properties of the Gestalt grouping process to distinguish between theories of perceptual grouping: grouping by cardinal cells and grouping by the gradual spread of synchrony or attention. In Experiment 1, we tested grouping by good continuation and common fate to explore conditions where grouping can occur in parallel and conditions where it requires serial processing. We found that parallel grouping is possible if the task is to detect a coherent target object on a background of incoherent distractor elements. Perceptual grouping becomes serial, however, when a target object is accompanied by an equally coherent distractor object. In this situation, local computations cannot determine the elements that belong together, and instead a number of local groupings have to be combined in a chain to determine the overall configuration of the target object. Experiment 2 generalized the serial process to similarity and proximity grouping, and demonstrated that the speed of incremental grouping increases if the elements of the target object are similar to each other and dissimilar from the distractor elements. This effect of target–distractor similarity excluded the possibility that the grouping takes place in parallel and that a postgrouping decision process is responsible for the increase in reaction time with the length of the object. If a parallel grouping process had been sufficient to delineate all the elements of a target object, then the properties of the distractor curve should not have influenced response times. Experiment 3 tested whether multiple incremental groupings can form concurrently, and we found clear evidence that incremental grouping occurs for only one object at a time. We now discuss the consequences of these new findings for theories of grouping and suggest that our results combined with the results from previous studies inspire a new theory of grouping that we call the incremental grouping theory (IGT).

Base Grouping and Incremental Grouping

We delineated conditions where grouping occurs in parallel and other conditions where it requires serial processing. The IGT proposes that there are two mechanisms for perceptual grouping. The parallel process is called base grouping (cf. Ullman, 1984), and we suggest that it is implemented in the visual cortex by cardinal cells, neurons that are tuned to specific configurations of image elements that can be extracted efficiently and in parallel across the visual scene (see Figure 1A). Neurons tuned to simple contour configurations are indeed observed in the visual cortex (Brincat & Connor, 2004; Pasupathy & Connor, 2001), and they are rapidly activated after stimulus presentation (Kreiman, Poggio, & DiCarlo, 2005; Oram & Perrett, 1992; Sugase, Yamane, Ueno, & Kawano, 1999). Consistent with this idea, our first experiment demonstrated that local base groupings of a number of image elements are indeed detected efficiently if they are embedded in a background of incoherent image elements (see Figure 3B), even if such base groupings are only represented in higher visual areas (as in Experiment 1B). Thus, our data support the idea of base grouping, but they provide at the same time evidence against the view (Ghose & Maunsell, 1999; Riesenhuber & Poggio, 1999; Tovee, 1994) that these base groupings solve all perceptual grouping tasks.

If the task cannot be solved by base grouping (e.g., because there are no neurons tuned to the critical feature conjunctions), the IGT proposes that a second, serial grouping process comes into play. We found that grouping by good continuation, common fate, proximity, and similarity indeed requires serial processing if there are multiple, equally coherent objects in the display. These objects cannot be distinguished from each other on the basis of local computations, and a chain of local groupings has to be evaluated in a transitive manner (comparable to the application of a visual routine; Ullman, 1984; Roelfsema, 2005). In these situations, the response time increases linearly with the number of image elements that need to be grouped together. It has been suggested in the visual search literature that the increase in the search time that occurs if more distractors are added to the display can be modeled as a serial process, but also as a parallel process with limited capacity (Townsend, 1971, 1990; Verghese, 2001). Is the linear increase in response time of the present study also consistent with such a capacity-limited parallel process? We believe not. In a previous study, we showed that attention spreads gradually along the target curve in the curve-tracing task, from the start of the curve until the location of the marker is reached (Houtkamp et al., 2003). It is hard to see how such a regular spread of attention along the target curve or target string could be modeled as a parallel process with limited capacity.

Experiment 2 tested the interactions between base and incremental grouping and exposed two factors that influence grouping speed. First, a high degree of within-object similarity that increases the size of the base grouping indeed promotes fast grouping (see also Mahoney & Ullman, 1988). Second, processing time depends on the similarity between the target object and other objects in its vicinity. Grouping speed decreases if the target object has the same color as a distractor object.
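The first of these two factors can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes that neighboring elements of the same color merge into one base grouping, that a base grouping holds at most `max_size` elements (a hypothetical parameter), and that each link between successive base groupings costs one serial step. It does not address the second factor, the similarity between target and distractor objects.

```python
from itertools import groupby


def base_groupings(colors, max_size):
    """Split a chain of elements into base groupings.

    Assumption for illustration: a base grouping is a run of neighboring
    elements with the same color, holding at most `max_size` elements.
    Returns the size of each base grouping in order along the chain.
    """
    sizes = []
    for _, run in groupby(colors):
        remaining = len(list(run))
        while remaining > 0:
            size = min(remaining, max_size)
            sizes.append(size)
            remaining -= size
    return sizes


def serial_steps(colors, max_size=3):
    """Number of links between successive base groupings along the chain."""
    return max(len(base_groupings(colors, max_size)) - 1, 0)


uniform = ["red"] * 6
mixed = ["red", "green", "red", "blue", "green", "red"]
print(serial_steps(uniform))  # two base groupings of three elements: 1 step
print(serial_steps(mixed))    # six base groupings of one element: 5 steps
```

In this toy setting a chain of similar elements needs fewer serial steps than a chain of dissimilar elements of the same length, in line with the idea that larger base groupings promote faster grouping.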

The Identity of the Label for Incremental Grouping

To our knowledge, the only theories that are consistent with a serial, incremental grouping process propose that there is a label that spreads along the low-level representations of image elements of a perceptual group. The advantage of such a labeling process is that new constellations of features that are related to each other by Gestalt cues can be represented flexibly without the necessity of creating new cardinal cells (Singer & Gray, 1995). Two labels have been proposed for binding: synchrony and an enhancement of neuronal firing rates (see Figures 1B and 1C).


The theoretical advantage of synchrony over the activity enhancement is that multiple objects can be labeled with distinct temporal patterns of neuronal discharges, permitting the coexistence of multiple incremental groups. We tested this prediction in Experiment 3 but obtained evidence that incremental grouping occurs for only a single object at a time. This finding is in line with neurophysiological results because recent studies that investigated neuronal synchrony during perceptual grouping tasks in behaving animals (Roelfsema et al., 2004; Thiele & Stoner, 2003) did not confirm the relationship between synchrony and binding that had been inferred from earlier work in anesthetized animals (Engel et al., 1992; Singer & Gray, 1995). In contrast, these studies supported the idea that image elements grouped in perception are labeled by an enhanced neuronal response (Roelfsema, 2006; Roelfsema et al., 1998). Apparently, the incremental grouping process is implemented as the spread of an enhanced response through lateral and feedback connections in the cortex linking neurons that represent the image elements of a single perceptual object. The neuronal response enhancement is usually believed to correspond to selective attention at the psychological level of description (reviewed by Desimone & Duncan, 1995; Lamme & Roelfsema, 2000). And indeed, in a curve-tracing task, attention gradually spreads over the target curve until all contour elements of this curve are labeled with attention (Houtkamp et al., 2003).

The FIT of Treisman and Gelade (1980) was the first to propose a role for attention in binding. Whereas a modified version of the theory (Treisman, 1996) also delineated the problem of part binding (i.e., the binding of parts of an object and segregating them from the background), the theory did not envision the possibility of creating perceptual objects by gradually spreading attention across a set of related but spatially separate features. The FIT proposed that focal attention is required for binding different features, such as a color and a shape, if present at the same spatial location. The newly proposed IGT rather argues that these local feature conjunctions can often be detected as base groupings, in accordance with neurophysiological findings that many neurons are selective for local feature conjunctions (Roelfsema, 2006). In contrast, we propose that attention is usually required to evaluate Gestalt grouping cues for binding features at different locations. The present data show that Gestalt grouping takes time, and our previous experiments showed that object-based attention gradually spreads over the image elements that need to be grouped (Houtkamp et al., 2003). It is therefore likely that the representations of nearby image elements, colinear contours, items moving in the same direction, and items with a similar color are only locally linked in the visual cortex, and object-based attention has to spread through these links to label all image elements of a perceptual object (see Figure 9).
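The idea of a label that spreads through local links can be sketched as a step-by-step propagation over a graph. The code below is a minimal illustration, not an implementation of the IGT: the element names, the helper functions, and the assumption that the label advances by one link per step are all hypothetical choices made for the example.

```python
def spread_label(links, start):
    """Spread a label from `start` through local links, one link per step.

    `links` maps each element to the set of elements it is locally linked
    to. Returns a dict mapping every reached element to the step at which
    it received the label.
    """
    labeled = {start: 0}
    frontier = [start]
    step = 0
    while frontier:
        step += 1
        next_frontier = []
        for element in frontier:
            for neighbor in links.get(element, ()):
                if neighbor not in labeled:
                    labeled[neighbor] = step
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return labeled


def chain(names):
    """Build symmetric local links between consecutive elements of a chain."""
    links = {name: set() for name in names}
    for a, b in zip(names, names[1:]):
        links[a].add(b)
        links[b].add(a)
    return links


# Two separate objects: a target chain and a distractor chain (hypothetical).
target = chain(["t0", "t1", "t2", "t3", "t4"])
distractor = chain(["d0", "d1", "d2"])
links = {**target, **distractor}

labels = spread_label(links, "t0")
print(labels["t4"])    # 4: the last target element is reached after 4 steps
print("d0" in labels)  # False: no local link connects target and distractor
```

In this sketch the number of steps needed to label the far end of the object grows with the length of the chain, and elements of an unlinked object never receive the label, which mirrors the serial, object-specific character of incremental grouping described above.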

Resolution of Conflicts in the Perceptual Grouping Literature

The present proposal of two mechanisms for grouping may resolve a number of apparent conflicts in the literature. There are many studies that obtained evidence for a parallel preattentive Gestalt grouping process but there are also clear demonstrations of a serial grouping process that requires attention.

Previous contour grouping studies (Jolicoeur et al., 1986, 1991; Roelfsema et al., 1999) as well as the results of the present study establish situations where Gestalt grouping requires a serial, attention-demanding process. If incremental grouping requires the spread of object-based attention, it should be hampered if attention is directed elsewhere. Such an interference has indeed been observed by Ben-Av et al. (1992), who investigated the grouping of image elements surrounding a centrally displayed letter. If the participants had to discriminate the central letter, they were unable to report the grouping of the surrounding image elements into rows or columns on the basis of proximity or similarity. This result suggests that grouping by proximity and similarity did not occur without attention.

However, there are also a number of studies that have reported parallel grouping outside the focus of attention (Kimchi & Razpurker-Apfeld, 2004; Moore & Egeth, 1997; Russell & Driver, 2005). In the studies by Kimchi and Razpurker-Apfeld (2004) and Russell and Driver (2005), for example, participants carried out a change detection task by comparing two successively presented arrays in central vision. There were also image elements in the surround, and unbeknownst to the subjects, these elements formed columns or rows on the basis of (isoluminant) color similarity. If both the central pattern and the grouping of background elements into rows or columns changed across displays, the participants were more likely to report the change than when only the central configuration changed while the peripheral grouping stayed the same. Thus, the peripheral groupings were registered although the subjects did not even become aware of these groupings, and these results therefore demonstrate a form of perceptual grouping without attention. The distinction between base and incremental grouping may offer reconciliation between the discrepant findings as the change detection studies may actually have tested base grouping. There are many neurons in the visual cortex that are selective for the orientation of isoluminant gratings (Gegenfurtner, Kiper, & Levitt, 1997), and the activity of these neurons may have exerted an unconscious biasing effect on the change detection task that was carried out in central vision. The availability of the required base groupings to solve a task may depend on low-level image properties. Elder and Zucker (1993, 1994), for example, demonstrated that the efficiency of visual search depends on contour closure and contrast polarity. They found that search for closed shapes with contours of the same contrast polarity is fast and efficient, whereas the search for shapes with fragmented contours of varying contrast polarity is serial and slow (see also Gilchrist et al., 1997, for a related finding). This can be explained if simple closed shapes with contours of the same contrast polarity give rise to base groupings, whereas fragmented, nonclosed shapes formed of contours with opposite contrast polarities do not and thus require incremental grouping.

Figure 9. Incremental grouping by the spread of object-based attention. Image elements of an elongated object are proposed to be grouped by the spread of attention (grey) from attended image elements to nearby image elements that are related to them. The representations of adjacent image elements with a similar color, motion, or in good continuation are linked locally (black lines).

The distinction between base and incremental grouping is also supported by an elegant study by Holcombe and Cavanagh (2001) in which subjects were asked to report the orientation of a colored grating that was alternated rapidly with an orthogonal grating with another color. For example, the subject would see a red leftward-tilted grating that alternated with a green rightward-tilted grating. The authors measured the maximal frequency of alternation at which the color of, say, the rightward-tilted grating could be reported. Subjects could do this at the remarkably short exposure duration of 30 ms per alternation. However, when the colors and orientations were separated in space by presenting them in adjacent locations, the minimal exposure duration increased dramatically, to 200 ms. We propose that these results can be explained by the distinction between base grouping and incremental grouping. Features at one location can be extracted by neurons tuned to the relevant feature conjunctions as base groupings, but features at different locations have to be grouped incrementally, which takes more time.

In the present series of experiments, we showed that very pronounced time delays occur in tasks that require the evaluation of chains of groupings. In these situations, the processing time increases linearly with the number of to-be-grouped image elements. Future studies can now start to evaluate whether incremental grouping occurs in everyday scenes where parts of an object are often linked through chains of local groupings that need to be combined in a transitive manner.

References

Barlow, H. B. (1972). Single units and sensation: A neuron doctrine for perceptual psychology? Perception, 1, 371–394.

Behrmann, M., Zemel, R. S., & Mozer, M. C. (1998). Object-based attention and occlusion: Evidence from normal participants and a computational model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1011–1036.

Ben-Av, M. B., Sagi, D., & Braun, J. (1992). Visual attention and perceptual grouping. Perception & Psychophysics, 52, 277–294.

Bergen, J. R., & Julesz, B. (1983, June 23). Parallel versus serial processing in rapid pattern discrimination. Nature, 303, 696–698.

Brincat, S. L., & Connor, C. E. (2004). Underlying principles of visual shape selectivity in posterior inferotemporal cortex. Nature Neuroscience, 7, 880–886.

Brincat, S. L., & Connor, C. E. (2006). Dynamic shape synthesis in posterior inferotemporal cortex. Neuron, 49, 17–24.

Davis, G., & Driver, J. (1994, October 27). Parallel detection of Kanizsa subjective figures in the human visual system. Nature, 371, 791–793.

Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Reviews in Neuroscience, 18, 193–222.

Elder, J., & Zucker, S. (1993). The effect of contour closure on the rapid discrimination of two-dimensional shapes. Vision Research, 33, 981–991.

Elder, J., & Zucker, S. (1994). A measure of closure. Vision Research, 34, 3361–3369.

Engel, A. K., König, P., Kreiter, A. K., Schillen, T. B., & Singer, W. (1992). Temporal coding in the visual cortex: New vistas on integration in the nervous system. Trends in Neuroscience, 15, 218–226.

Field, D. J., Hayes, A., & Hess, R. F. (1993). Contour integration by the human visual system: Evidence for a local “association field.” Vision Research, 33, 173–193.

Gegenfurtner, K. R., Kiper, D. C., & Levitt, J. B. (1997). Functional properties of neurons in macaque area V3. Journal of Neurophysiology, 77, 1906–1923.

Ghose, G. M., & Maunsell, J. H. R. (1999). Specialized representations in visual cortex: A role for binding? Neuron, 24, 79–85.

Gigus, Z., & Malik, J. (1991). Detecting curvilinear structure in images (Technical Report No. 91/619). Berkeley: University of California.

Gilchrist, I. D., Humphreys, G. W., Riddoch, M. J., & Neumann, H. (1997). Luminance and edge information in grouping: A study using visual search. Journal of Experimental Psychology: Human Perception and Performance, 23, 464–480.

Grossberg, S., & Raizada, R. D. S. (2000). Contrast-sensitive perceptual grouping and object-based attention in the laminar circuits of primary visual cortex. Vision Research, 40, 1413–1432.

Holcombe, A. O., & Cavanagh, P. (2001). Early binding of feature pairs for visual perception. Nature Neuroscience, 4, 127–128.

Houtkamp, R., Spekreijse, H., & Roelfsema, P. R. (2003). A gradual spread of attention during mental curve tracing. Perception & Psychophysics, 65, 1145–1160.

Jolicoeur, P., Ullman, S., & MacKay, M. (1986). Curve tracing: A possible basic operation in the perception of spatial relations. Memory & Cognition, 14, 129–140.

Jolicoeur, P., Ullman, S., & MacKay, M. (1991). Visual curve tracing properties. Journal of Experimental Psychology: Human Perception and Performance, 17, 997–1022.

Julesz, B. (1981, March 12). Textons, the elements of texture perception, and their interactions. Nature, 290, 91–97.

Kimchi, R., & Razpurker-Apfeld, I. (2004). Perceptual grouping and attention: Not all groupings are equal. Psychonomic Bulletin & Review, 11, 687–696.

Koffka, K. (1935). Principles of Gestalt psychology. New York: Harcourt Brace.

Kovacs, I., & Julesz, B. (1993). A closed curve is much more than an incomplete one: Effect of closure in figure-ground segmentation. Proceedings of the National Academy of Sciences, USA, 90, 7495–7497.

Kreiman, G., Poggio, T., & DiCarlo, J. J. (2005, November 4). Fast readout of object identity from macaque inferior temporal cortex. Science, 310, 863–866.

Kubovy, M., Holcombe, A. O., & Wagemans, J. (1998). On the lawfulness of grouping by proximity. Cognitive Psychology, 35, 71–98.

Lamme, V. A. F., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23, 571–579.

Mahoney, J. V., & Ullman, S. (1988). Image chunking defining spatial building blocks for scene analysis. In Z. Pylyshyn (Ed.), Computational processes in human vision: An interdisciplinary perspective (pp. 169–209). Norwood, NJ: Ablex.

Moore, C. M., & Egeth, H. (1997). Perception without attention: Evidence of grouping under conditions of inattention. Journal of Experimental Psychology: Human Perception and Performance, 23, 339–352.

Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Palanca, B. J. A., & DeAngelis, G. C. (2005). Does neuronal synchrony underlie visual feature grouping? Neuron, 46, 333–346.

Pasupathy, A., & Connor, C. E. (2001). Shape representation in area V4: Position-specific tuning for boundary conformation. Journal of Neurophysiology, 86, 2505–2519.

Pringle, R., & Egeth, H. E. (1988). Mental curve tracing with elementary stimuli. Journal of Experimental Psychology: Human Perception and Performance, 14, 716–728.

Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025.

Rock, I., & Palmer, S. (1990). The legacy of Gestalt psychology. Scientific American, 263, 48–61.

Roelfsema, P. R. (2005). Elemental operations in vision. Trends in Cognitive Sciences, 9, 226–233.

Roelfsema, P. R. (2006). Cortical algorithms for perceptual grouping. Annual Reviews in Neuroscience, 29, 203–227.

Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998, September 24). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376–381.

Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (2004). Synchrony and covariation of firing rates in the primary visual cortex during contour grouping. Nature Neuroscience, 7, 982–991.

Roelfsema, P. R., Scholte, H. S., & Spekreijse, H. (1999). Temporal constraints on the grouping of contour segments into spatially extended objects. Vision Research, 39, 1509–1529.

Russell, C., & Driver, J. (2005). New indirect measures of “inattentive” visual grouping in a change-detection task. Perception & Psychophysics, 67, 606–623.

Scholte, H. S., Spekreijse, H., & Roelfsema, P. R. (2001). The spatial profile of visual attention in mental curve tracing. Vision Research, 41, 2569–2580.

Sha’ashua, A., & Ullman, S. (1988). Structural saliency: The detection of globally salient structures using a locally connected network. In Proceedings of the 2nd International Conference on Computer Vision (pp. 321–327). Washington, DC: IEEE Computer Society Press.

Shadlen, M. N., & Movshon, J. A. (1999). Synchrony unbound: A critical evaluation of the temporal binding hypothesis. Neuron, 24, 67–77.

Singer, W., & Gray, C. M. (1995). Visual feature integration and the temporal correlation hypothesis. Annual Reviews in Neuroscience, 18, 555–586.

Smith, A. R. (1978). Color gamut transform pairs. In Proceedings of the 5th annual conference on computer graphics and interactive techniques (pp. 12–19). New York: ACM.

Spillmann, L., & Werner, J. S. (1996). Long-range interactions in visual perception. Trends in Neurosciences, 19, 428–434.

Sporns, O., Gally, J. A., Reeke, G. N., & Edelman, G. M. (1989). Reentrant signaling among simulated neuronal groups leads to coherency in their oscillatory activity. Proceedings of the National Academy of Sciences, USA, 86, 7265–7269.

Sugase, Y., Yamane, S., Ueno, S., & Kawano, K. (1999, August 26). Global and fine information coded by single neurons in the temporal visual cortex. Nature, 400, 869–873.

Thiele, A., & Stoner, G. R. (2003, January 23). Neuronal synchrony does not correlate with motion coherence in cortical area MT. Nature, 421, 366–370.

Tovee, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Townsend, J. T. (1971). A note on the identifiability of parallel and serial processes. Perception & Psychophysics, 10, 161–163.

Townsend, J. T. (1990). Serial vs. parallel processing: Sometimes they look like Tweedledum and Tweedledee but they can (and should) be distinguished. Psychological Science, 1, 46–54.

Treisman, A. M. (1982). Perceptual grouping and attention in visual search for features and for objects. Journal of Experimental Psychology: Human Perception and Performance, 8, 194–214.

Treisman, A. M. (1996). The binding problem. Current Opinion in Neurobiology, 6, 171–178.

Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97–136.

Treisman, A. M., & Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychological Review, 95, 15–48.

Ullman, S. (1984). Visual routines. Cognition, 18, 97–159.

Verghese, P. (2001). Visual search and attention: A signal detection theory approach. Neuron, 31, 523–535.

Von der Malsburg, C. (1999). The what and why of binding: The modeler’s perspective. Neuron, 24, 95–104.

Wertheimer, M. (1923). Untersuchungen zur Lehre von der Gestalt II. [Studies in the theory of Gestalt psychology]. Psychologische Forschung, 4, 301–350.

Wolfe, J. M., & Bennett, S. C. (1997). Preattentive object files: Shapeless bundles of basic features. Vision Research, 37, 25–43.

Received April 16, 2009
Revision received February 4, 2010
Accepted February 8, 2010
