Attention Issues in Spatial Information Systems: Directing Mobile Users Visual Attention Using...

Journal of Management Information Systems / Spring 2007, Vol. 23, No. 4, pp. 163–184.

© 2007 M.E. Sharpe, Inc.

0742–1222 / 2007 $9.50 + 0.00.

DOI 10.2753/MIS0742-1222230408

Attention Issues in Spatial Information Systems: Directing Mobile Users’ Visual Attention Using Augmented Reality

FRANK BIOCCA, CHARLES OWEN, ARTHUR TANG, AND COREY BOHIL

FRANK BIOCCA is AT&T Chaired Professor of Telecommunication, Information Studies, and Media at Michigan State University and at the Center for Knowledge and Innovation Research, Helsinki School of Economics. His research interests focus on human–computer interaction, specifically interfaces that augment individual and group cognition. He is the founder and director of the M.I.N.D. Labs, a collaborative network of 11 labs in seven countries.

CHARLES OWEN is an Associate Professor in the Department of Computer Science and Engineering at Michigan State University. He is the Director of the Media and Entertainment Technologies Laboratory. Dr. Owen conducts research in augmented reality, computer graphics, and multimedia.

ARTHUR TANG is an Assistant Professor in the Department of Industrial Engineering and Management Systems at the University of Central Florida. He is the associate director of the M.I.N.D. Lab at the University of Central Florida. His research interests include human factors in augmented reality and virtual reality, cognitive psychology in computer interface, experimental evaluation of computer interfaces, and computer-mediated communication.

COREY BOHIL is a Postdoctoral Fellow and Lab Manager at Michigan State University’s M.I.N.D. Lab. He is a cognitive psychologist with interests in human–computer interaction, perceptual classification, perception and action, and cognitive modeling.

ABSTRACT: Knowledge of objects, situations, or locations in the environment can be productive, useful, or even life-critical for mobile augmented reality (AR) users. Users may need assistance with (1) dangers, obstacles, or situations requiring attention; (2) visual search; (3) task sequencing; and (4) spatial navigation. The omnidirectional attention funnel is a general purpose AR interface technique that rapidly guides attention to any tracked object, person, or place in the space. The attention funnel dynamically directs user attention with strong bottom-up spatial attention cues. In a study comparing the attention funnel to other attentional techniques such as highlighting and audio cueing, search speed increased by over 50 percent, and perceived cognitive load decreased by 18 percent. The technique is a general three-dimensional cursor in a wide array of applications requiring visual search, emergency warning, and alerts to specific objects or obstacles, or for three-dimensional navigation to objects in space.

KEY WORDS AND PHRASES: augmented reality, geospatial information system, location-based services, mobile computing, spatial information systems, visual attention.

The Use of Mobile Systems in the Management of Information and Objects

WITH THE EVOLUTION OF MOBILE COMPUTER SYSTEMS, there is a tighter and more ubiquitous integration of the virtual information space with physical space. For example, the use of databases marked by geospatial data or radio frequency identification (RFID) tagging, together with mobile displays, enables potential integration of virtual information and physical assets—the two are dynamically linked. Locations, such as buildings or rooms, and objects, such as packages, vehicles, or tools, are often linked to arrays of information in databases. But interfaces are still emerging that allow mobile users to efficiently and fully use this information on-site for navigation, team coordination, object location, and object retrieval. Of current interfaces, the most suited to mobile geospatial information display is augmented reality (AR). AR systems allow users to be aware of perfectly spatially registered information, ranging from simple two-dimensional (2D) labels to three-dimensional (3D) labels or virtual markers.

AR techniques allow users to see buildings, objects, and tools superimposed with computer-generated virtual annotations. Unlike its cousin virtual reality (VR), AR enhances the real environment rather than replacing it with computer-generated imagery. Graphics are superimposed on the user’s view of the real environment.

Early adoptions of AR interfaces can be found in information systems where spatially registered 3D information can improve the performance of users. Current application areas that incorporate AR interfaces include industrial training [5, 34, 35, 36], computer-aided surgery [1], homeland security and military information systems [4, 14, 18, 19, 20], computer visualization, engineering design, interior design and modeling [8, 16], computer-assisted instruction (CAI) [7, 34, 35, 36], and entertainment [2, 13, 26].

One of the most promising applications of AR is the display of computer-generated information to guide the work of a user to specific spatial locations such as buildings, tools, packages, and other assets tracked by database systems. The ability to overlay and register any type of information on the working environment in a spatially meaningful way allows AR to be a more effective medium for information display.

Studies of user performance in AR-based information systems indicate that they can provide unique human factors benefits—as compared to approaches using traditional printed manuals or other computer-based approaches—such as improved task performance, decreased error rates, and decreased mental workload [34, 35, 36]. Information objects such as labels, overlays, 3D objects, and other information are integrated into the physical environment. Objects, tasks, and locations can be cued when appropriate to support navigation and mobile active user tasks.

“The pervasiveness, mode of delivery, and degree of control over information systems in organizations have been evolving continually” [37, p. 6]. Increased network access via heterogeneous wireless network topologies enables mobile users to have “anytime, anywhere” access to information for work and personal communication [6]. The rapid proliferation of mobile information services such as cellular phones, short message services (SMS), and global positioning systems (GPS) has created an array of new mobile location-based services. For example, users’ real-time geospatial information can be incorporated into mobile permission marketing [15] to create a new “location-based mobile marketing” service.

Mobile AR systems are the most compatible with geospatial data, as they are designed to register virtual information to locations in space far more precisely than the typical geographic information system (GIS). An example is the use of AR to tightly integrate medical 3D data (e.g., CAT scans, MRI images) with the patient’s body during surgery [1, 29]. This capability creates the potential for location-based services that provide an additional dimension to existing information systems and services—the guidance of mobile users’ attention to any spatial location for alerts, navigation, or object retrieval.

At the user level, mobile interfaces that can continuously guide users place demands on user attention. However, “despite the rapid growth of mobile telephony and the mobile Internet, research concerning m-commerce interfaces is still in the early stages” [17, p. 98]. Mobile information-rich applications of AR systems begin to push up against a fundamental human factors limitation, the limited attention capacities of the human cognitive system. For example, cell phones split attention between virtual information (i.e., a caller talking about a different spatial context) and the demands of the user’s physical environment. These attention demands of mobile interfaces such as cellular phones appear to contribute to automobile accidents [28, 33].

If AR interfaces are to guide user attention in real time, then a fundamental interface issue needs to be addressed: How can an AR system successfully manage and guide visual attention to places in the environment where critical information or objects are present, even when they are not within the visual field? To describe the problem another way: What does a 3D omnidirectional cursor look like? This question is part of a larger set of issues that we refer to as attention management and augmentation in mobile AR and VR interfaces.

Example Scenarios Where Visuospatial Cueing Can Support User Search and Navigation

To illustrate the benefits of managing visuospatial attention using a mobile AR information system, consider the following common scenarios.

Telecollaborative Spatial Cueing

An emergency paramedic wears a head-mounted camera and an AR head-mounted display (HMD) while collaborating with a remote physician during a medical emergency. The remote physician is viewing the scene through the camera and needs to “point” to a piece of equipment that the technician must use next. What is the quickest way to direct the technician’s attention to the correct tool among a large and cluttered set of alternatives, especially if the tool tray is outside the technician’s visual field and he or she does not know the subtle difference between a Schroeder and a Pozzi tenaculum forceps?

Object Search

A warehouse worker uses a mobile AR information system to manage inventory, and is searching for a specific box in an aisle stocked with dozens of virtually identical boxes. Based on inventory records of the information systems integrated into the warehouse, the box is stored on a shelf behind the user. What is the most efficient way to signal the location to the user?

Procedural Cueing During Training

A trainee repair technician uses an AR system to learn a sequence of procedural steps where parts and tools are used to repair complex manufacturing equipment. How can the computer best indicate which tool and part to select next in the procedural sequence, especially when the parts and tools may be distributed throughout a large workspace?

Spatial Navigation

A service repair technician with a personal digital assistant (PDA) equipped with GPS is looking for a specific building and piece of equipment in a large office complex with many similar buildings. The building is around the corner and down the street. What is the fastest way to signal a walking path to the front door of the building?

Attention Management

ATTENTION IS ONE OF THE MOST LIMITED MENTAL RESOURCES [30]. Attention is used to focus the human cognitive capacity on a certain sensory input so that the brain can concentrate on processing information of interest. Attention is primarily directed internally, from the “top down” according to the current goals, tasks, and larger dispositions of the user. Attention, especially visual attention, can also be cued by the environment. For example, attention can be user driven, that is, “find the screwdriver,” collaborator driven, “use this scalpel now,” or system driven, “please use this tool for the next step.”

Attention management is a central human–computer interaction issue in the design of interfaces and devices [12, 24]. For example, the attention demands of current interfaces such as cellular phones and PDAs may play a significant role in automobile accidents [28, 33]. The scenarios from the previous section illustrate various cases where attention must be guided, augmented, or managed by the AR system or by a remotely communicating user.

Attention Cueing in Existing Information Interfaces

Users and interface designers have evolved various ways to direct visual attention in interpersonal interaction, architectural settings, and standard interfaces.

Attention Cueing During Interpersonal Interaction

In interpersonal interaction, there are various sets of cues that are labeled indexical cues. The phrase comes from the most obvious cue to visual attention, the pointing of an index finger directing the eyes to “look there.” Similarly, we learn early in life to monitor movement of other people’s gaze, “drawing” a mental vector to the spatial location of the person’s visual attention. These virtual vectors create an implicit cue of “look there.” Gestures, eye movement, and various other linguistic cues help disambiguate otherwise confusing spatial terms in languages such as “this,” “that,” “over there,” and vague descriptive references to objects or locations in space.

Spatial linguistic cues can be the most ambiguous spatial cues. The meaning of spatial language (e.g., “left,” “here,” “in front of”) varies with respect to the spatial reference frame of the speaker, listener, and the environment. For areas that need accuracy (e.g., boating, theater), conventions are used (e.g., stage left, dolly in, port, starboard) to partially resolve this ambiguity problem, but the language in common usage does not include this level of specialization.

The ambiguity of spatial language creates major communication problems when an information system needs to communicate spatial content to a user, or when another person communicates to the user remotely through an AR or other collaborative system. Neither natural language nor nonverbal interactions in current interfaces are sufficient for complex and remote interactions.

Spatial Cueing in Windows Interfaces

WIMP (window, icon, menu, and pointer) interfaces benefit from the assumption that the user’s visual attention is directed to the limited real estate of the screen. Visual cues such as flashing cursors, pointers, radiating circles, jumping centered windows, color contrast, or content cues are used to direct visual attention to spatial locations on the screen surface. The integration of audio with visual cues helps draw attention even when vision is not directed to the screen.

Of course, these systems work within the confines of a very limited physical area, an area so small that most users can scan it quickly. These techniques cannot easily cue objects in the 3D environment around a mobile user, for example, pointing at a tool, building, or team member located behind a user equipped with a PDA.

Spatial cueing techniques used in interpersonal communication, WIMP interfaces, and architectural environments are not easily transferred to mobile systems, be they PDAs, tablet PCs, or mobile AR systems.

In mobile AR environments, attention is shared and spread across many tasks in the physical and virtual environment. Tasks in the virtual space may not be the primary user task. This is very different from typical computer tasks such as word processing in standard WIMP interfaces. For example, individuals may be walking freely in the environment, working with physical tools and objects, and interacting with others while processing virtual information. The user may not be at the correct location in the scene, or looking at the correct spatial location or information needed to accomplish a task.

When communicating with remote users, the indexical cues of interpersonal communication are not available or are presented in a decreased modality, so finger-pointing and eye gazing are useless and linguistic references to “this,” “that,” and “over there” are even more ambiguous than in direct communication.

Spatial Cursors and Cueing Techniques in Augmented Reality Systems

Currently, there are few, if any, general mobile interface paradigms to quickly direct spatial attention to information or locations anywhere in the environment. In mobile AR environments, the volume of information is potentially vast and omnidirectional. AR environments have the capacity to display large amounts of informational cues to physical objects in the environment.

Responsiveness is important for mobile multitasking computing environments. In a mobile multitasking setting, a user’s ability to detect specific virtual or physical information at the appropriate time is limited. Visual attention is even more limited, because the system may have information about objects anywhere in an omnidirectional working environment around the user. Visual attention is limited to the field of view of human eyes (< 200 degrees), and this limitation is often further narrowed by the field of view of HMDs (< 80 degrees).
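
As a rough illustration of the field-of-view limits above, the following sketch tests whether a cued target falls inside the visible region of the eye or of an HMD. It is a simplification under stated assumptions: the field of view is modeled as a symmetric cone around the view direction (real visual fields are wider horizontally than vertically), the viewer looks along -z, and the function names are hypothetical rather than part of the authors' system.

```python
import math

def angle_between_deg(a, b):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))

def in_field_of_view(view_dir, to_target, fov_deg):
    """True if the target direction lies inside a cone of full angle fov_deg.

    Assumption: the field of view is a symmetric cone, which is only a
    coarse stand-in for a real eye or display frustum.
    """
    return angle_between_deg(view_dir, to_target) <= fov_deg / 2.0

# Example: a target 60 degrees to the side of the view direction.
view = (0.0, 0.0, -1.0)
target = (math.sin(math.radians(60)), 0.0, -math.cos(math.radians(60)))
visible_to_eye = in_field_of_view(view, target, 200.0)  # within the eye's limit
visible_in_hmd = in_field_of_view(view, target, 80.0)   # outside the HMD's limit
```

In this example the target is within the unaided visual field but outside the display, which is the situation where an on-screen highlight alone cannot cue the object.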

Alternative Interface Approaches

We are introducing the omnidirectional attention funnel, a unique, generalizable interface design for mobile information search. To place the development of the attention funnel in context, we provide a review of alternative approaches to the same common problem.

Simple and Spatial Audio Cueing

In collaborative applications of mobile phones, the simplest and most common technique for cueing the location of objects is language—that is, “The red box should be on our left.” The ambiguity and limitations of this method have been discussed, and are especially limiting when response time is a factor or the language cannot be presented in an interrogatory setting, where users can ask questions that help to resolve ambiguities.

An alternative audio cueing method for mobile systems is the use of stereo spatial audio to produce directional audio cues. These have been used for guidance for the blind and sighted [21, 23]. Spatial audio and the human auditory systems do not have the spatial resolution to inform spatial location precisely [31] and localization can be slow, especially in a noisy auditory field [25].

WIMP Cursor and Highlighting Techniques

Many AR systems adopt WIMP cursor techniques or visual highlighting to direct users’ attention to an object (e.g., [7, 22]). Pointers in space appear over the object of attention or the object is outlined as a wire diagram. These techniques may not be effective for mobile AR systems. Highlighting techniques, such as highlighting a whole building, assume that a detailed virtual model of the object, building, or tool is known. AR systems often need to direct attention to real-world objects, and virtual models generally do not exist even if a GPS or RFID location is known. Also, cues such as highlighting or cursors assume that the user is looking in the direction of the cued object (i.e., that it is on the screen or in the display). The cued objects may be off to the side or behind the user.

Maps

In mobile systems, maps are sometimes used to cue the GPS or spatial location of buildings, and so on. Maps may be adequate for very large objects such as buildings, but become ambiguous when cueing the location of small objects such as tools (for example, one of several emergency medical tools such as a scalpel). When maps are utilized, users must spatially correlate the map image with the surroundings, mentally transferring the marked location to the real world, a sometimes daunting task.

Omnidirectional Attention Funnel: A Cursor Paradigm for Mobile 3D Interaction

THE LIMITED IMPLEMENTATION OF A GENERAL TECHNIQUE for directing visual attention in 3D space suggests that interface design in a mobile AR system presents three basic challenges in managing and augmenting the attention of the user:

1. Omnidirectional cueing. How to quickly and successfully cue visual attention to any location of physical or virtual information when there is an immediate need.

2. Minimal attention demands. How to keep virtual information from consuming or interfering with attention to tasks, objects, or navigation in the physical environment.

3. General applicability. How to provide a general technique that helps users find and interact with physical or virtual objects at various distances while the user is mobile.

To meet these challenges, we have designed a new spatial interface concept, called the Omnidirectional Attention Funnel, as part of the Mobile Infospaces project, a multiyear collaborative effort that examines human factors issues in the design of high volume, mobile AR systems. The attention funnel interface techniques are designed as a general purpose interface paradigm that addresses the broad range of attention management challenges of mobile AR systems implemented on various platforms from high-end head-mounted wearable systems to tablet PCs, PDAs, or smart phones.

The omnidirectional attention funnel is an AR display technique for rapidly guiding visual attention to any location in physical or virtual space. The fundamental components of the attention funnel are illustrated in Figures 1 and 2. The most visible component is the set of dynamic 3D virtual objects linking the view of the user directly to the virtual or physical object. In spatial cognitive terms, the attention funnel visually links a head-centered coordinate space directly to an object-centered coordinate space, funneling focal spatial attention of the user to the cued object. The attention funnel takes advantage of spatial cueing techniques impossible in the real world, along with AR’s ability to dynamically overlay 3D virtual information onto the physical environment.

Like many AR components, the AR funnel paradigm consists of (1) a display technique, the attention funnel, combined with (2) methods for tracking and detecting the location of objects to be cued.

Components of the Attention Funnel

To test and demonstrate the concept, the attention funnel interface component was implemented as a user interface widget designed for mobile AR applications in the ImageTclAR development environment [27]. This interface widget provides a mechanism for drawing visual attention to locations, information, or paths in an AR environment.

Figure 1. Illustration of the Attention Funnel
Note: The attention funnel links the head of the viewer directly to an object anywhere around the body.

The basic components of the attention funnel, as illustrated in Figure 2, are

1. a view plane with a virtual bore-sight in the center and a pointer arrow above;

2. a dynamic set of increasingly smaller funnel planes;

3. 3D “crosshairs” targeting the object location; and

4. a curved, dynamic path (see Figures 1 and 3) linking the head or viewpoint of the user and all the elements directly to the object.

Along the curved dynamic path, the funnel planes are repeated in space and normal to the line. We refer to this line and the repeated patterns as an attention funnel. The path drawn for near objects is defined by a Hermite curve [10]. A Hermite curve is a cubic curve segment defined by a start location, end location, and derivative vectors at each end. The curve follows a path from the starting point in the direction of the starting end derivative vector. It ends at the end point with the curve approaching the end point in the direction of the derivative vector. As a cubic curve segment, the curve presents a smoothly changing path from the start point (i.e., the user’s view plane) to the end point (i.e., the 3D “crosshairs” target) with curvature controlled by the magnitude of the derivative vectors. Hermite curves are a standard cubic curve method. Figure 3 clearly illustrates the curvature of the funnel from a bird’s-eye view.

Figure 2. Basic Component of an Attention Funnel
Notes: Three basic patterns are used to construct a funnel: (A) the head-centered plane includes a bore-sight to mark the center of the pattern from the user’s viewpoint; (B) funnel planes, added in a fixed pattern (approximately every 0.2 meters) between the user and the object; and (C) the object marker pattern, which includes crosshairs marking the approximate center of the object.

The start point for the Hermite curve is located at a specified distance in front of the origin in a frame defined to be the viewpoint of the user (the center of projection for a single viewpoint or average of two viewpoints for stereo viewers). The curve terminates at the target. The curve is a cubic interpolating curve that creates a smoothly varying path from start to target. The derivative vectors that specify the end curvatures of the curve are selected so as to emit an attention funnel in the view direction that approaches the target from the viewer’s direction. The curvatures of the starting and ending points are specified in the application.
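
The Hermite path described above can be sketched in a few lines. This is a minimal illustration, not the ImageTclAR implementation: it uses the standard cubic Hermite basis functions, and the function names, the -z view direction, the derivative magnitudes, and the polyline approximation used to space the funnel planes are all assumptions made for the example.

```python
import math

def hermite_point(p0, p1, m0, m1, t):
    """Point on a cubic Hermite segment at parameter t in [0, 1].

    p0, p1 are the start and end locations; m0, m1 are the derivative
    vectors at the start and end.
    """
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1   # weight of the start point
    h10 = t3 - 2 * t2 + t       # weight of the start derivative
    h01 = -2 * t3 + 3 * t2      # weight of the end point
    h11 = t3 - t2               # weight of the end derivative
    return tuple(h00 * a + h10 * b + h01 * c + h11 * d
                 for a, b, c, d in zip(p0, m0, p1, m1))

def funnel_plane_positions(p0, p1, m0, m1, spacing=0.2, samples=400):
    """Positions roughly `spacing` meters apart along the curve.

    Assumption: arc length is approximated by summing short straight
    segments between evenly spaced parameter values.
    """
    positions = [p0]
    travelled = 0.0
    prev = p0
    for i in range(1, samples + 1):
        cur = hermite_point(p0, p1, m0, m1, i / samples)
        travelled += math.dist(prev, cur)
        if travelled >= spacing:
            positions.append(cur)
            travelled = 0.0
        prev = cur
    return positions

# Hypothetical example: the viewer looks along -z; the target is behind
# and to the right of the viewer.
start = (0.0, 0.0, -0.5)     # a set distance in front of the viewpoint
target = (1.0, 0.0, 2.0)
m_start = (0.0, 0.0, -3.0)   # leaves along the view direction
m_end = (1.5, 0.0, 3.0)      # assumed: arrives heading away from the viewer
planes = funnel_plane_positions(start, target, m_start, m_end)
```

Larger derivative magnitudes push the curve further forward before it bends toward the target, which is how the magnitudes control curvature.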

Figure 3. Illustration of the Attention Funnel from a Bird’s-Eye View
Notes: As the head and body move, the attention funnel dynamically provides continuous feedback. Affordances from the perspective cues automatically guide the user toward the cued location or object. Dynamic head movement cues are provided by the skew (e.g., left, right, up, down) of the attention funnel. The level of alignment (skew) of the funnel provides an immediate intuitive sense of how much the body or head must turn to see the object.

The orientation of each pattern along the visual path is obtained by spherical linear interpolation of the up direction of the source frame and the up direction of the target frame, so as to transition from an alignment with the view frame to an upright alignment with the target. Spherical linear interpolation was introduced to the computer graphics community by Shoemake [32], and it is different from linear interpolation in that the angle between each interval is constant—that is, the changes of orientations of the patterns are smooth. The formula used is:

$$\vec{\nu}(t) = \frac{\sin\big((1 - t)\,\theta\big)}{\sin\theta}\,\vec{\nu}_1 + \frac{\sin(t\,\theta)}{\sin\theta}\,\vec{\nu}_2.$$

In this equation, $t \in [0, 1]$, and $\theta$ is the angle between $\vec{\nu}_1$ and $\vec{\nu}_2$, computed as

$$\theta = \cos^{-1}\big(\vec{\nu}_1 \cdot \vec{\nu}_2\big).$$

The computational cost of this method is very small, involving the solution of the cubic curve equation (three cubic polynomials), the spherical interpolation equation, and a rotation matrix for each pattern display location.
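
The spherical linear interpolation of the up directions can be sketched as follows. The sketch assumes both inputs are unit vectors; the function name and the fallback for nearly parallel vectors are choices made for this example and are not described in the paper.

```python
import math

def slerp(v1, v2, t):
    """Spherical linear interpolation between unit vectors v1 and v2.

    t = 0 returns v1 and t = 1 returns v2.
    """
    dot = sum(a * b for a, b in zip(v1, v2))
    dot = max(-1.0, min(1.0, dot))   # guard acos against rounding error
    theta = math.acos(dot)           # angle between the two vectors
    sin_theta = math.sin(theta)
    if sin_theta < 1e-6:
        # Assumption: for nearly parallel vectors fall back to a linear
        # blend. Exactly opposite vectors have no unique interpolation
        # and are not handled here.
        return tuple((1.0 - t) * a + t * b for a, b in zip(v1, v2))
    w1 = math.sin((1.0 - t) * theta) / sin_theta
    w2 = math.sin(t * theta) / sin_theta
    return tuple(w1 * a + w2 * b for a, b in zip(v1, v2))

# Example: the up vector halfway between "up" and "right" is the
# 45-degree diagonal, and it keeps unit length.
halfway = slerp((0.0, 1.0, 0.0), (1.0, 0.0, 0.0), 0.5)
```

Evaluating `slerp` at evenly spaced values of t, one per funnel plane, rotates the planes by equal angular steps between the view frame and the target frame.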

The purpose of the attention funnel is to draw visual attention to a target physical or virtual object when it is not properly directed. When the user is looking in the desired direction, the attention funnel becomes superfluous and can cause visual clutter and distraction. The solution to this case is to fade the funnel planes, leaving only the view plane and target 3D crosshairs, as the dot product of the source and target derivative vectors approaches 1, indicating that the direction to the target is close to the view direction.
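
One way to realize this fade is to map the dot product to an opacity for the intermediate funnel planes. The sketch below is illustrative only: the paper does not give the fade thresholds or the shape of the ramp, so `fade_start`, `fade_end`, and the linear ramp are assumptions, as is normalizing the derivative vectors before taking the dot product.

```python
import math

def _unit(v):
    """Return v scaled to unit length (v must be non-zero)."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def funnel_plane_opacity(source_deriv, target_deriv,
                         fade_start=0.90, fade_end=0.99):
    """Opacity in [0, 1] for the intermediate funnel planes.

    Assumptions: both derivative vectors are normalized first, and the
    planes fade linearly between the two hypothetical thresholds.
    """
    a, b = _unit(source_deriv), _unit(target_deriv)
    alignment = sum(x * y for x, y in zip(a, b))
    if alignment <= fade_start:
        return 1.0   # user is looking elsewhere: show the full funnel
    if alignment >= fade_end:
        return 0.0   # user is looking at the target: hide the planes
    return (fade_end - alignment) / (fade_end - fade_start)

# Example: a target 20 degrees off the view direction is partly faded.
off_axis = (math.sin(math.radians(20)), 0.0, -math.cos(math.radians(20)))
partial = funnel_plane_opacity((0.0, 0.0, -3.0), off_axis)
```

The view plane and the target crosshairs would be drawn at full opacity regardless, so the user keeps a minimal cue once the funnel planes have faded.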

Affordances in the Attention Funnel that Guide Navigation and Body Rotation

The attention funnel uses various overlapping visual cues that guide body rotation, head rotation, and gaze direction of the user.

Building on an attention sink pattern introduced by Hochberg [11], the attention funnel uses strong perspective cues as shown in Figure 4. Each attention funnel plane has diagonal vertical lines that provide depth cueing toward the center of the pattern. Each succeeding funnel plane is placed so that it fits within the preceding plane when the planes are aligned in a straight line. Increasing degrees of alignment cause the interlocking patterns to draw visual attention toward the center. Three basic patterns are used to construct a funnel: (1) the head-centered plane includes a bore-sight to mark the center of the pattern from the user’s viewpoint; (2) funnel planes, added in a fixed pattern (currently every 12 centimeters) between the user and the object; and (3) the object marker pattern, which includes a bounding box marking the approximate center of the object. Patterns 1 and 3 are used for dynamically cueing the user that they have “locked onto” the object (see below).

As the head and body move, the attention funnel provides continuous feedback that indicates to the user how to turn his or her body or head toward the cued location or object. Continuous dynamic head movement cues are provided by the skew (e.g., left or right) of the attention funnel. The pattern of the funnel provides an immediate intuitive sense of the location of the object relative to the head. For example, if the funnel skews to the right, then the user knows to move his or her head to the right (e.g., more skewing suggests that more body rotation is needed to see it). The funnel continuously changes, providing a dynamic cue that one is getting closer to being “in sync” and locked onto the cued object. When looking directly at the object, the funnel fades so as to minimize visual clutter. A target behind the user is indicated by a funnel that moves forward for visibility, then turns and heads behind the user, a clear visual cue.
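
The turning cue conveyed by the funnel's skew can be expressed numerically as the signed angle from the view direction to the target. The sketch below assumes a head-centered frame with x to the right, y up, and -z forward; that convention and the function name are hypothetical and chosen only for the example.

```python
import math

def turn_cue(target_in_head):
    """Describe how the head must turn to face a target.

    target_in_head is the target position in a head-centered frame
    (assumed: x right, y up, -z forward). Returns (yaw_deg, pitch_deg,
    side, behind), where positive yaw means "turn right" and positive
    pitch means "look up".
    """
    x, y, z = target_in_head
    yaw = math.degrees(math.atan2(x, -z))
    pitch = math.degrees(math.atan2(y, math.hypot(x, z)))
    if yaw > 0:
        side = "right"
    elif yaw < 0:
        side = "left"
    else:
        side = "ahead"
    behind = abs(yaw) > 90.0
    return yaw, pitch, side, behind

# Example: a target ahead and to the right, and one behind on the left.
front_right = turn_cue((1.0, 0.0, -1.0))
back_left = turn_cue((-1.0, 0.0, 1.0))
```

A larger absolute yaw corresponds to a more strongly skewed funnel, and a target flagged as behind the user is the case where the funnel first moves forward and then bends back around the viewer.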

Methods for Sensing or Marking Target Objects or Locations

Attention funnels are applicable to any augmented vision display technology capable of presenting 3D graphics including HMDs and video see-through devices such as tablet PCs or handheld computers. The location of target objects or locations in the environment may be known to the system because they are (1) virtual objects in tracked 3D space, (2) tagged with sensors such as visible markers or RFID tags, or (3) predefined spatial locations as in GPS coordinates. Virtual objects in tracked 3D space are the most straightforward case, as the attention funnel can link the user to the location of the target virtual object dynamically. Objects tagged with RFID tags are not necessarily detectable at a distance, but local sensing in a facility may be sufficient to indicate a position that can be utilized for attention direction.

In some cases, the location of the object is detected by sensors and is not known ahead of time. An implementation we are currently exploring involves the detection of visible markers with omnidirectional cameras, which can be implemented in a video see-through or optical see-through system. (Note that this implementation is different from the traditional video see-through system, where the only camera used represents the viewpoint of the user.) The head-mounted omnidirectional camera detects markers in a 360-degree environment around the user. The relation of the camera to the user’s viewpoint is known. Detected objects can be cued for the user based on task needs or search requests by the user (e.g., “find the tool box”).
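Because the relation of the camera to the user's viewpoint is known, a marker position detected in the camera frame can be re-expressed in the head frame with a fixed rigid transform. The following is a minimal sketch under that assumption; the function name, the 3x3 rotation, and the 10-centimeter mounting offset are hypothetical values chosen only for illustration.

```python
def camera_to_head(p_cam, rotation, translation):
    """Map a point from camera coordinates to head coordinates.

    Applies p_head = R * p_cam + t, where `rotation` is a 3x3 row-major
    matrix and `translation` is the camera origin expressed in the head frame.
    """
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# Hypothetical mounting: camera axes aligned with the head, 0.10 m above the eyes.
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
MOUNT_OFFSET = (0.0, 0.10, 0.0)

if __name__ == "__main__":
    marker_in_camera = (0.5, -0.1, -2.0)
    print(camera_to_head(marker_in_camera, IDENTITY, MOUNT_OFFSET))
```

The resulting head-frame position is what a cueing technique such as the funnel would need as its target endpoint.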

Figure 4. Example of the Attentional Funnel Drawing Attention of the User to an Object on the Shelf—the Box

User Evaluation in a Visual Search and Retrieval Task

DOES THE ATTENTION FUNNEL TRULY DIRECT user attention more efficiently than the most common techniques used in current AR interfaces? We conducted a study to evaluate the effectiveness of the attention funnel in guiding attention around the immediate space of the user [3].

A common task for an AR cursor system in a mobile setting is to guide a user to an object that the user needs to retrieve in the immediate environment. The attention funnel paradigm was tested against two alternative techniques: (1) a commonly used AR highlighting technique, where the target object is cued by a surrounding green bounding box, and (2) a control condition mimicking interpersonal interaction, where the object to be found is indicated only by its name (e.g., “pick up the screwdriver”). A 360-degree omnidirectional workspace was created using four tables as shown in Figure 5. Forty-eight objects were distributed over the four tables (12 objects each). Half of these objects were primitive geometric objects of different colors and the other half recognizable tools (e.g., screwdriver, stapler, and notebook).

Methodology

A within-subjects experiment was conducted to test the performance of the attention funnel design against other conventional attention direction techniques: visual highlighting and verbal cues. The experiment had one independent variable, the method used for directing attention, with three alternatives: (1) the attention funnel, (2) visual highlight techniques, and (3) a control condition consisting of a simple linguistic cue common in current mobile phones (e.g., “look for the red box”).

Participants

Fourteen paid participants, drawn from a university student population, took part in the study.

Stimulus Materials and Test Environment

Three interface metaphors for directing visuospatial attention were designed and implemented: (1) the attention funnel, (2) visual highlighting of the spatial location of the object, and (3) an audio instruction interface using a verbal description of an object.

Attention Funnel Condition

In the attention funnel interface, a series of linked rectangles dynamically links the visual field to the spatial location of the target object.

Visual Highlight Condition

For the visual highlight interface, a 3D bounding box was placed so as to appear spatially registered at the location of the target object.

Audio Instruction Condition

For the audio instruction condition, visual search was directed by playing a prerecorded audio description of the target object for the user via a pair of headphones (e.g., “Please grab the [item]”). Each audio cue took approximately 1.5–2 seconds to play.

Apparatus and Test Environment

A 360-degree omnidirectional workspace was created using four tables as shown in Figure 5. Twelve objects were placed on each table: six primitive objects of different colors (e.g., red box, black sphere) on a shelf, and six general objects (e.g., stapler, notebook) on the tabletop.

Visual cues were displayed in stereo with the Sony Glasstron LDI-100B HMD, and audio stimulus materials were presented with a pair of headphones. Head motion was tracked by an Intersense IS-900 ultrasonic/inertial hybrid tracking system. Stereo graphics were rendered in real time based on the data from the tracker. A pressure sensor was attached to the thumb of a glove to capture the reaction time when the subject grasped the target object.

Figure 5. Test Environment
Note: The user sat in the middle of the test environment for the visual search task. It consisted of an omnidirectional workspace assembled from four tables, each with 12 objects (six primitive shapes and six general office objects), for a total of 48 target search objects.

Presentation of stimulus materials, audio instructions for participants, experimental procedure sequencing, and data collection for the experiment were automated so that the experimenter did not need to manually record the experimental results. The experiment was developed in the ImageTclAR AR development environment [27].

Measurements

Search Time, Error, and Variability

Search time in milliseconds was measured as the time it took for participants to grab a target object from among the 48 objects following the onset of an audio cue tone. The end of the search time was triggered by the pressure sensor on the thumb of the glove when the user touched the target object. An error was logged for cases when participants selected the wrong object.

Mental Workload

Participants’ perceived task workload in each condition was measured using the NASA Task Load Index after each experimental condition [9].

Procedure

Participants entered a training environment where they were introduced to each interface (audio, visual highlight, attention funnel) and trained to use it. They then began the experiment. Each subject experienced the interface treatment conditions (audio, visual highlight, and attention funnel) and each object search trial in a randomized order. For each condition, participants were cued to find and touch one of the 48 objects in the environment as quickly and accurately as possible. Each participant completed 24 trials, balanced such that 12 trials involved searching for a random selection of primitive objects and 12 trials involved randomly selected general everyday objects.
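A randomized, balanced schedule of this kind can be sketched as follows. This is an illustration only, not the procedure-sequencing code used in the study: it assumes the 24 trials (12 primitive, 12 general) are drawn separately for each condition, which is one possible reading of the description, and the object names are placeholders.

```python
import random

CONDITIONS = ("audio", "visual highlight", "attention funnel")

def make_schedule(primitives, general, rng, per_category=12):
    """Build one participant's schedule as a list of (condition, targets) blocks.

    Condition order is shuffled, and each block holds `per_category` primitive
    targets plus `per_category` general targets in shuffled order.
    """
    order = list(CONDITIONS)
    rng.shuffle(order)
    schedule = []
    for condition in order:
        targets = rng.sample(primitives, per_category) + rng.sample(general, per_category)
        rng.shuffle(targets)
        schedule.append((condition, targets))
    return schedule

if __name__ == "__main__":
    # Placeholder names standing in for the 24 primitive and 24 general objects.
    primitives = [f"primitive-{i}" for i in range(24)]
    general = [f"general-{i}" for i in range(24)]
    for condition, targets in make_schedule(primitives, general, random.Random(0)):
        print(condition, targets[:3], "...")
```

Passing an explicit `random.Random` instance keeps each participant's schedule reproducible from a seed.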

Results

A general linear model repeated-measures analysis of variance (ANOVA) was conducted to test the effect of the interface metaphors on the different performance indicators. There was a significant effect of interface type on search time, F(2,13) = 10.031, p < 0.001, and on search time consistency (i.e., smallest standard deviation), F(2,13) = 23.066, p < 0.001. The attention funnel interface clearly allowed subjects to find objects in the least amount of time and with the most consistency (mean [M] = 4473.75 milliseconds [ms], standard deviation [SD] = 1064.48) compared to the visual highlight interface (M = 6553.12 ms, SD = 2421.10) and the audio-only interface (M = 4991.94 ms, SD = 3882.11), which had the largest standard deviation. See Figure 6.

There was a significant effect of interface type on the participants’ perceived mental workload, F(2,14) = 4.178, p < 0.05. The results indicate that the attention funnel interface had the lowest mental workload (M = 44.64, SD = 16.96), compared with the visual highlight interface (M = 54.57, SD = 18.26) and the audio interface (M = 55.57, SD = 12.43). See Figure 7.

There was no significant effect of interface type on error, F(2,13) = 1.507, p > 0.05 (attention funnel, M = 1.14, SD = 0.77; visual highlight, M = 1.43, SD = 1.56; audio, M = 0.86, SD = 1.03).

Discussion

When compared to standard cueing techniques such as visual highlighting and audio cueing, we found that the attention funnel decreased the visual search time by 22 percent overall, or approximately 28 percent for the visual search phase alone, and 14 percent over its next fastest, as shown in Figure 6. While increased speed is valuable, in some applications of AR, such as medical emergencies and other high-risk applications, it may be critical that the system also support consistent user performance. The attention funnel had a very robust effect on making the user search consistently, with a significantly lower standard deviation than the other two cueing techniques. The interface increased users’ consistency by 65 percent on average, and by 56 percent over the next best interface.

Figure 6. Search Time and Consistency by Experimental Condition
Note: Attentional funnel decreased search time by 22 percent on average (28 percent when reach time is subtracted) and increased search consistency (decreased variability) by 65 percent.

A key criterion for a mobile interface is the need for minimal attention demand. In cases where AR environments are used for emergency services, repair work, and other time-critical and attention-demanding applications, search may require costly mental effort. The effects of interface type on mental workload are illustrative, as shown in Figure 7. Cueing users with only audio, which involved holding the object in memory, required additional mental workload. But visual highlighting techniques, which demand less memory, also demanded additional mental workload, possibly because of the uncertainty of where to search. The attention funnel, which placed limited demand on memory and which directed search immediately and continuously, provided an 18 percent decrease in mental workload.

In summary, the attention funnel led to faster search and retrieval times, greater consistency of performance, and decreased mental workload when compared to verbal cueing and visual highlighting techniques.
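The percentage gains discussed above can be checked against the means and standard deviations reported in the Results section. The sketch below assumes that "overall" means a comparison with the average of the two alternative interfaces, and that "next best" means the smaller of the two alternative standard deviations; both readings are assumptions, and the helper names are illustrative.

```python
def percent_reduction(baseline, value):
    """Percentage by which `value` falls below `baseline`."""
    return 100.0 * (baseline - value) / baseline

def mean(values):
    values = list(values)
    return sum(values) / len(values)

FUNNEL = "attention funnel"

# Values as reported in the Results section.
search_mean_ms = {FUNNEL: 4473.75, "visual highlight": 6553.12, "audio": 4991.94}
search_sd_ms = {FUNNEL: 1064.48, "visual highlight": 2421.10, "audio": 3882.11}
workload_tlx = {FUNNEL: 44.64, "visual highlight": 54.57, "audio": 55.57}

def others(table):
    """Values for every condition except the attention funnel."""
    return [v for k, v in table.items() if k != FUNNEL]

# Funnel versus the average of the two alternatives.
speed_gain = percent_reduction(mean(others(search_mean_ms)), search_mean_ms[FUNNEL])
# Funnel SD versus the next smallest SD.
consistency_gain = percent_reduction(min(others(search_sd_ms)), search_sd_ms[FUNNEL])
# Funnel workload versus the average of the two alternatives.
workload_gain = percent_reduction(mean(others(workload_tlx)), workload_tlx[FUNNEL])

if __name__ == "__main__":
    print(f"search time: {speed_gain:.1f}%")
    print(f"consistency vs. next best: {consistency_gain:.1f}%")
    print(f"mental workload: {workload_gain:.1f}%")
```

Under these assumptions the three values come out at about 22.5, 56.0, and 18.9 percent, in line with the 22, 56, and 18 percent figures given in the text.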

Limitations

THE ATTENTION FUNNEL WAS DESIGNED as a unique interface technique for directing and guiding users’ attention to any location in 4π steradians. The approach is unique and patent pending. As indicated above, current techniques used in 3D games and simulations, such as the highlighting of 3D objects, are not feasible in real-world scenes. No virtual 3D model will preexist for most real-world objects such as buildings, packages, tools, and so on, even if the location is known using global positioning or RFID tags.

Figure 7. Mental Workload Measured by NASA TLX [9] for Each Experimental Condition

As there is no standard, we tested the attention funnel against the most commonly used AR techniques [3]. This presents a limitation to the current study, as the logical comparison is a set of possible or unknown techniques, which have not been implemented. We are currently implementing and exploring other possible cueing techniques such as 3D arrows, lines, and so on.

Furthermore, an ideal test of the attention funnel would take place in complex, outdoor environments with fully mobile individuals cued to find objects within and far outside of reach. This would add ecological validity to the findings.

Application of the Attention Funnel to Various Mobile and 3D Interfaces

THE ATTENTION FUNNEL PARADIGM INVOLVES basic techniques that have potentially broad applicability in AR and VR interfaces: a user’s attention has to be directed to objects or locations in order to accomplish tasks.

Broadly, the attention funnel techniques can support user performance in the following generic classes of fundamental AR tasks:

• Physical object selection. Situations in which a user may be looking for a physical object in space; for example, a tool on a workbench, a box in a warehouse, a door in space, the next part to assemble during object assembly, and so on. The system can direct the user to the correct object.

• Virtual object selection. An AR system may insert labels or 3D objects inside the environment. These may be within or outside the current view of the user. Attention funnels can cue them to look at the spatially registered label, tool, or cue.

• Visual search in a cluttered space. The user may be searching in a highly cluttered natural or artificial environment. An attention funnel can be used to cue them to the correct location to view, even if they are not looking in the right place.

• Navigation in near space. The system might also need to direct the walking path of the individual through near space (e.g., through aisles, etc.). A directional funnel path (a slightly different implementation than the attention funnel above) can be used to indicate and cue the user’s direction, and provide dynamic cues as to path accuracy.

• Navigation in far space. An attention funnel can direct users to distant landmarks. As an example, someone walking toward an office several blocks away must maintain a link to the landmark as they navigate through an urban environment, even when landmarks are obscured.

With the success of AR systems, designers will seek to add potentially rich, even unlimited, layers of virtual information onto physical space. As AR systems are used in various real, demanding, mobile applications such as manufacturing assembly, warehousing, tourism, navigation, training, and distant collaboration, interface techniques appropriate to the AR medium will be needed to manage the mobile user’s limited attention, improve user performance, and limit cognitive demands for optimal spatial performance. The AR attention funnel paradigm represents an example of cognitive engineering interface techniques for which there is no real-world equivalent, and which is specifically adapted for users of AR systems navigating and working in information- and object-rich environments.

Future Work

WE ARE CURRENTLY IMPLEMENTING the attention funnel technique on other mobile devices, including handheld devices such as PDAs and cell phones. The attention funnel can be overlaid on a live video stream captured by a handheld camera, while the spatial location of the user can be determined using GPS, digital compass, or triangulation of cellular or RFID signals. Figure 8 illustrates the implementation of the attention funnel technique on a tablet PC. The attention funnel technique has some important implications for the usability of location-based consumer information systems. As an example, the attention funnel can be used to display navigation information generated by commercial GISs (e.g., Microsoft Mappoint, Google Maps) from a first-person perspective, as illustrated in Figure 9. The attention funnel technique can also be used to display location-based touring alert information to a mobile user via a PDA or cell phone (e.g., the location of a shop or restaurant can be cued by an attention funnel displayed on the screen of a PDA).
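For a handheld device that knows its own GPS position and compass heading, the direction in which a funnel would need to point can be derived from the bearing to the target. The following is a minimal sketch using the standard great-circle initial bearing formula; the function names are illustrative, and nothing here is taken from the implementation under development.

```python
import math

def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360).

    0 is north and 90 is east; inputs are in decimal degrees.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0

def relative_bearing_deg(heading_deg, bearing_deg):
    """Signed turn from the compass heading to the target bearing, in [-180, 180).

    Positive values mean the target lies to the right of the current heading.
    """
    return (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0

if __name__ == "__main__":
    bearing = initial_bearing_deg(0.0, 0.0, 0.0, 1.0)  # target due east on the equator
    print(relative_bearing_deg(45.0, bearing))  # user currently facing northeast
```

The signed relative bearing plays the same role as the funnel skew: its sign indicates which way to turn, and its magnitude how far.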

Figure 8. Implementation of the Attention Funnel Technique on a Tablet PC
Note: The attention funnel can be used with a tracker-enabled tablet PC. In this implementation, the tablet PC (or smart phone) acts as a “magic” window upon scenes annotated with information.

Acknowledgments: The authors acknowledge the assistance of Betsy McKeon, Amanda Hart, and Mark Rosen in the preparation of this paper. They also appreciate the suggestions and recommendations provided by the three anonymous reviewers on an earlier version of this paper. This project is one element of the Mobile Infospaces project and supported in part by a grant from the National Science Foundation CISE 02–22831. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

REFERENCES

1. Bajura, M.; Fuchs, H.; and Ohbuchi, R. Merging virtual objects with real world: Seeing ultrasound imagery within the patient. Computer Graphics, 26, 2 (1992), 203–210.

2. Bimber, O. Video see-through AR on consumer cell phones. In Proceedings of the Third IEEE and ACM International Symposium on Mixed and Augmented Reality. Los Alamitos, CA: IEEE Computer Society Press, 2004, pp. 252–253.

3. Biocca, F.; Tang, A.; Owen, C.; and Xiao, F. Attention funnel: Omnidirectional 3D cursor for mobile augmented reality platforms. In R. Grinter, T. Rodden, P. Aoki, E. Cutrell, R. Jeffries, and G. Olson (eds.), Proceedings of the ACM CHI 2006, Conference on Human Factors in Computer Systems. New York: ACM Press, 2006, pp. 1115–1122.

4. Brown, D.; Stripling, R.; and Coyne, J. Augmented reality for urban skills training. In Proceedings of IEEE Virtual Reality Conference 2006. Los Alamitos, CA: IEEE Computer Society Press, 2006, pp. 249–252.

5. Caudell, T., and Mizell, D. Augmented reality: An application of heads-up display technology to manual manufacturing processes. In Ralph H. Sprague Jr. (ed.), Proceedings of the Twenty-Fifth Annual Hawaii International Conference on System Sciences. Los Alamitos, CA: IEEE Computer Society Press, 1992, pp. 659–669.

6. Fang, X.; Chan, S.; Brzezinski, J.; and Xu, S. Moderating effects of task type on wireless technology acceptance. Journal of Management Information Systems, 22, 3 (Winter 2005–6), 123–157.

Figure 9. An Illustration of a “Navigation” Funnel, Drawn on the Real-World Scene to Guide an Individual to Distant Objects or Destinations

7. Feiner, S.; MacIntyre, B.; and Seligmann, D. Knowledge-based augmented reality. Communications of the ACM, 36, 7 (1993), 52–62.

8. Feiner, S.; Webster, A.; Krueger, T.; MacIntyre, B.; and Keller, E. Architectural anatomy. Presence: Teleoperators and Virtual Environments, 4, 3 (1995), 318–325.

9. Hart, S. Development of NASA-TLX (task load index): Results of empirical and theoretical research. In P. Hancock and N. Meshkati (eds.), Human Mental Workload. Amsterdam: North-Holland, 1988, pp. 239–250.

10. Hearn, D., and Baker, M.P. Computer Graphics, C Version. Upper Saddle River, NJ: Prentice Hall, 1996.

11. Hochberg, J. Representation of motion and space in video and cinematic displays. In K. Boff, L. Kaufman, and J. Thomas (eds.), Handbook of Perception and Human Performance, vol. 1. New York: Wiley, 1986, pp. 22.1–22.64.

12. Horvitz, E.; Kadie, C.; Paek, T.; and Hovel, D. Models of attention in computing and communication: From principles to applications. Communications of the ACM, 46, 3 (2003), 52–59.

13. Jebara, T.; Eyster, C.; Weaver, J.; Starner, T.; and Pentland, A. Stochasticks: Augmenting the billiards experience with probabilistic vision and wearable computers. In Proceedings of the First International Symposium on Wearable Computers. Los Alamitos, CA: IEEE Computer Society Press, 1997, pp. 138–145.

14. Julier, S.; Baillot, Y.; Lanzagorta, M.; Brown, D.; and Rosenblum, L. BARS: Battlefield augmented reality system. Paper presented at the NATO Symposium on Information Processing Techniques for Military Systems, Istanbul, Turkey, October 2000.

15. Kavassalis, P.; Spyropoulou, N.; Drossos, D.; Mitrokostas, E.; Gikas, G.; and Hatzistamatiou, A. Mobile permission marketing: Framing the market inquiry. International Journal of Electronic Commerce, 8, 1 (Fall 2003), 55–79.

16. Klinker, G.; Stricker, D.; and Reiners, D. Augmented reality for exterior construction applications. In W. Barfield and T. Caudell (eds.), Fundamentals of Wearable Computers and Augmented Reality. Mahwah, NJ: Lawrence Erlbaum, 2001, pp. 379–427.

17. Lee, Y., and Benbasat, I. A framework for the study of customer interface design for mobile commerce. International Journal of Electronic Commerce, 8, 3 (Spring 2004), 79–102.

18. Livingston, M.; Brown, D.; Julier, S.; and Schmidt, G. Military applications of augmented reality. Paper presented at the NATO Human Factors and Medicine Panel Workshop on Virtual Media for Military Applications, West Point, June 2006.

19. Livingston, M.; Rosenblum, L.; Julier, S.; Brown, D.; Baillot, Y.; Swan, E., II; Gabbard, J.; and Hix, D. An augmented reality system for military operations in urban terrain. Paper presented at the Interservice/Industry Training, Simulation and Education Conference, Orlando, FL, December 2002.

20. Livingston, M.; Swan, E., II; Julier, S.; Baillot, Y.; Brown, D.; Rosenblum, L.; Gabbard, J.; and Höllerer, T. Evaluating system capabilities and user performance in the battlefield augmented reality system. Paper presented at the Performance Metrics for Intelligent Systems Workshop, Gaithersburg, MD, August 2004.

21. Loomis, J.; Golledge, R.; and Klatzky, R. Navigation system for the blind: Auditory display modes and guidance. Presence: Teleoperators and Virtual Environments, 7, 2 (1998), 193–203.

22. Mann, S. Telepointer: Hands-free completely self contained wearable visual augmented reality without headwear and without any infrastructural reliance. In Proceedings of Fourth International Symposium on Wearable Computers. Los Alamitos, CA: IEEE Computer Society Press, 2000, pp. 177–178.

23. Marston, J.; Loomis, J.; Klatzky, R.; Golledge, R.; and Smith, E. Evaluation of spatial displays for navigation without sight. ACM Transactions on Applied Perception, 3, 2 (2006), 110–124.

24. McCrickard, D., and Chewar, C. Attentive user interface: Attuning notification design to user goals and attention costs. Communications of the ACM, 46, 3 (2003), 67–72.

25. Middlebrooks, J., and Green, D. Sound localization by human listeners. Annual Review of Psychology, 42 (1991), 135–159.

26. Ohshima, T.; Satoh, K.; Yamamoto, H.; and Tamura, H. AR2 hockey system: A collaborative mixed reality system. Transactions of the Virtual Reality Society of Japan, 3, 2 (1998), 55–60.

27. Owen, C.; Tang, A.; and Xiao, F. ImageTclAR: A blended script and compiled code development system for augmented reality. Paper presented at STARS2003: The International Workshop on Software Technology for Augmented Reality Systems, Tokyo, Japan, 2003.

28. Redelmeier, D.A., and Tibshirani, R.J. Association between cellular telephone calls and motor vehicle collisions. New England Journal of Medicine, 336, 7 (1997), 453–458.

29. Rolland, J.; Wright, D.; and Kancherla, A. Towards a novel augmented-reality tool to visualize dynamic 3D anatomy. In K. Morgan, H. Hoffman, D. Stredney, and S. Weghorst (eds.), Proceedings of Medicine Meets Virtual Reality 5. 1997, pp. 337–348.

30. Shiffrin, R. Visual processing capacity and attentional control. Journal of Experimental Psychology: Human Perception and Performance, 5, 1 (1979), 522–526.

31. Shinn-Cunningham, B. Localizing sounds in rooms. Paper presented at the ACM SIGGRAPH and EUROGRAPHICS Campfire: Acoustic Rendering for Virtual Environments, Snowbird, UT, May 2001.

32. Shoemake, K. Animating rotation with quaternion curves. Computer Graphics, 19, 3 (1985), 245–254.

33. Strayer, D.L., and Johnston, W. Driven to distraction: Dual-task studies of simulated driving and conversing on a cellular phone. Psychological Science, 12, 6 (2001), 462–466.

34. Tang, A.; Owen, C.; Biocca, F.; and Mou, W. Experimental evaluation of augmented reality in object assembly task. In Proceedings of the First IEEE and ACM International Symposium on Mixed and Augmented Reality. Los Alamitos, CA: IEEE Computer Society Press, 2002, pp. 265–266.

35. Tang, A.; Owen, C.; Biocca, F.; and Mou, W. Comparative effectiveness of augmented reality in object assembly. In V. Bellotti, T. Erickson, G. Cockton, and P. Korhonen (eds.), Proceedings of ACM CHI 2003, Conference on Human Factors in Computing Systems. New York: ACM Press, 2003, pp. 73–80.

36. Tang, A.; Owen, C.; Biocca, F.; and Mou, W. Performance evaluation of augmented reality for directed assembly. In A. Nee and S. Ong (eds.), Virtual Reality and Augmented Reality Applications in Manufacturing. London: Springer-Verlag, 2004, pp. 301–322.

37. Zwass, V. Management information systems—Beyond the current paradigm. Journal of Management Information Systems, 1, 1 (Summer 1984), 3–10.