Attention Manipulation for Naval Tactical Picture Compilation

Attention Manipulation for Naval Tactical Picture Compilation

Tibor Bosse1, Rianne van Lambalgen

1, Peter-Paul van Maanen

1,2, and Jan Treur

1

1 Vrije Universiteit Amsterdam, Amsterdam, The Netherlands

{tbosse,rm.van.lambalgen,treur}@cs.vu.nl 2 TNO Human Factors, Soesterberg, The Netherlands

[email protected]

Abstract

This paper presents an agent model that is able to

manipulate the visual attention of a human, in order to

support naval crew. It consists of four submodels,

including a model to reason about a subject’s

attention. A practical case study was formally

analysed, and the results were verified using

automated checking tools. Results show how a human

subject’s attention is manipulated by adjusting

luminance, based on assessment of the subject’s

attention. A first evaluation of the agent shows a

positive effect.

1. Introduction

In the domain of naval warfare, it is crucial for the

crew of the vessels involved to be aware of the

situation in the field. One of the crew members is

usually assigned the task to identify and classify all

entities in the environment (e.g., [6]). This task

determines the type and intent of a multiplicity of

contacts on a radar screen. Attention is typically

directed to one bit of information at a time [13], [14],

[16]. A supporting software agent may alert the human

about a contact if it is ignored. To this end the agent

has to maintain a model of the cognitive state of the

human including the human’s distribution of attention.

Existing cognitive models on attention show that it is

possible to predict a person’s attention based on a

saliency map, calculated from features of a stimulus,

like luminance, colour and orientation [7], [12]. In this

study, a Theory of Mind (or ToM, e.g., [4]) model is

exploited within the agent model to analyse attention

of the human. Attention can then be influenced (or

‘manipulated’) by changing features of stimuli, e.g., its

contrast with stimuli at other locations [7], [8], [11], its

luminance [15], [17], or its form [17].

Some approaches in the literature address the

development of software agents with a Theory of Mind

(e.g., [4], [9], [10]), but only address a model of the

epistemic (e.g., beliefs), motivational (e.g., desires,

intentions), and/or emotional states of other agents. For

the situation sketched above, attribution of attentional

states has to be addressed. In the current paper, an

agent model has been developed, which uses four

specific (sub)models. The first is a representation of a

dynamical model of human attention, for estimation of

the locations of a person’s attention, based on

information about features of objects on the screen and

the person’s gaze. The second model is a reasoning

model which the agent uses to reason through the first

model, to generate beliefs on attentional states at any

point in time. With a third model the agent compares

the output of the second model with a normative

attention distribution and determines the discrepancy.

Finally, a fourth model is used to direct the person’s

attention to relevant contacts based on the output of the

third model.

Initial versions of the first two models were adopted

from earlier work [5]. The current paper focuses on the

last two models. Section 2 shortly describes a

formalisation of the different models. In Section 3 a

case study is shown where the software agent is used to

manipulate a subject’s attention. Based on this case

study, Section 4 addresses experimental validation of

the results, and Section 5 automated verification of

different important properties of the submodels used in

the agent. In Section 6, a formal mathematical analysis

of the model is given. Finally, Section 7 is a

discussion.

2. A Theory of Mind for Attention

This section introduces the ambient agent with a

Theory of Mind on attention and the four different

models which are used within this agent: Dynamic

Attention Model, Model for Beliefs about Attention,

Model to Determine Discrepancy and Decision Model

for Attention Adjustment. The agent and its interaction

with the environment (involving a complex task and an

eye-tracker, see Section 3) are schematically displayed

in Figure 1.

Figure 1. Overview of the ambient its environment

2.1. The Dynamic Attention Model used

This model is taken over from [5] and is only

briefly summarised in this section. The model uses

three types of input: information about the human’s

gaze direction, about locations (or spaces)

features of objects on the screen. Based on this,

makes an estimation of the current attention

distribution at a time point: an assignment of attention

values AV(s, t) to a set of attention spaces at that time.

The attention distribution is assumed to have a certain

persistency. At each point in time the

level is related to the previous attention, byAV(s,t) = λ ⋅ AV(s,t-1) + (1 - λ) ⋅ AVnorm

Here, λ is the decay parameter for the decay of the

attention value of space s at time point t

AVnorm(s,t) is determined by normalisation

total amount of attention A(t), described by:

��norm��, � � ��new��, �∑ ��new��, ��

·

��new��, � � ��pot��, �1 � � · ��, �

Here AVnew(s, t) is calculated from the potential

attention value of space s at time point t and the

relative distance of each space s to the gaze point (the

centre). The term r(s,t) is taken as the Euclidian

distance between the current gaze point and s at time

point t (multiplied by an importance factor

determines the relative impact of the distance to the

Figure 2. Overview of attention model.

features of

objects

gaze direction

locations of

objects

mbient agent and

The Dynamic Attention Model used

This model is taken over from [5] and is only

briefly summarised in this section. The model uses

information about the human’s

(or spaces) and about

. Based on this, it

current attention

at a time point: an assignment of attention

values AV(s, t) to a set of attention spaces at that time.

The attention distribution is assumed to have a certain

persistency. At each point in time the new attention

is related to the previous attention, by

norm(s,t)

Here, λ is the decay parameter for the decay of the

attention value of space s at time point t – 1, and

normalisation for the

total amount of attention A(t), described by:

��

� ��

(s, t) is calculated from the potential

attention value of space s at time point t and the

relative distance of each space s to the gaze point (the

centre). The term r(s,t) is taken as the Euclidian

distance between the current gaze point and s at time

point t (multiplied by an importance factor α which

determines the relative impact of the distance to the

gaze point on the attentional state, which can be

different per individual and situation):

��, � � �eucl� !"#�The potential attention value AV

weighted sum of the features of the space (i.e., of t

types of objects present) at that time (e.g., luminance,

colour):

��pot��, � � $ %��, �&'(� )

For every feature there is a saliency map M, which

describes its potency of drawing attention (e.g. [7], [9],

[17]). Moreover, M(s,t) is the unweighted potential

attention value of s at time point t, and w

weight used for saliency map M, where 1

0 ≤ wM(s,t) ≤ 1.

Figure 2 shows an overview of this model.

circles denote the italicised concepts introduced above,

and the arrows indicate influences between concepts.

2.2. Model for Beliefs about Attention

This (reasoning) model is used to generate beliefs

about attentional states of the other agent. The

agent uses the dynamical system model as described in

Section 2.1 as an internal simulation model to generate

new attentional states from the previous ones, gaze

information and features of the object, with the use of a

forward reasoning method (forward in time) as

described in [2]. The basic specification of th

reasoning model can be expressed by the

representation leads_to_after(I, J, D) (belief that

J after duration D). Here, I and J are both information

elements (i.e., they may correspond to any concept

from Figure 2, e.g., gaze_at(1, 2)

0.68).

In addition, the representation

information on the world (including human processes)

at different points in time. It represents a

state I holds at time point T.

at(gaze_at(1,2), 53) expresses that at time point 53, the

human’s gaze is at the space with coordinates {1,2}.

Figure 2. Overview of attention model.

attention level

for location

normalised attention

contribution

of locations

current attention

contribution

of locations

gaze point on the attentional state, which can be

different per individual and situation):

��, ��

The potential attention value AVpot(s,t) is a

weighted sum of the features of the space (i.e., of the

types of objects present) at that time (e.g., luminance,

� � · *)��, �

For every feature there is a saliency map M, which

describes its potency of drawing attention (e.g. [7], [9],

]). Moreover, M(s,t) is the unweighted potential

time point t, and wM(s,t) is the

weight used for saliency map M, where 1 ≤ M(s,t) and

Figure 2 shows an overview of this model. The

circles denote the italicised concepts introduced above,

and the arrows indicate influences between concepts.

Model for Beliefs about Attention

This (reasoning) model is used to generate beliefs

about attentional states of the other agent. The software

agent uses the dynamical system model as described in

Section 2.1 as an internal simulation model to generate

new attentional states from the previous ones, gaze

information and features of the object, with the use of a

orward in time) as

described in [2]. The basic specification of the

reasoning model can be expressed by the

(belief that I leads to

I and J are both information

espond to any concept

or has_value(av(1,2),

representation at(I, T) gives

on the world (including human processes)

It represents a belief that

holds at time point T. For example,

expresses that at time point 53, the

is at the space with coordinates {1,2}.

attention level

locations

2.3. Model to Determine Discrepancy

With this model the agent determines the

discrepancy between actual and desirable attentional

states and to what extent the attention distribution has

to change. This is based on a model for the desirable

attention distribution (prescriptive model). For the case

addressed this means an assessment of which objects

deserve attention (based on features as distance, speed

and direction). To be able to make such assessments,

the agent is provided with some tactical domain

knowledge, in terms of heuristics (also see Section 3.1)

2.4. Decision Model for Attention Adjustment

Given a desire to adjust the attention distribution,

the agent uses this model to determine how the inputs

for the attention model have to be changed. Besides

output from the model in Section 2.3, it uses the

structure of the model described in Section 2.1.

Relations between variables within the latter model are

followed in a backward manner, thereby propagating

the desired adjustment from the attentional state

variable to the features of the object at the screen. This

is a form of desire refinement: starting from a (root)

variable, step-by-step a desire on adjusting a (parent)

variable is refined to desires on adjustments of the

(children) variables on which the (parent) variable

depends, until the leave variables are reached. Starting

point is the adjustment of the attentional state av(s), see

the following reasoning rule:

desire(av(s)>h) ∧ belief(has_value(av(s), v)) ∧ v<h →→ desire(adjust_by(av(s), (h-v)/v)

Note that here the adjustment is taken relative.

Suppose as a point of departure an adjustment ∆v1 is

desired, and that v1 depends on two variables v11 and

v12 that are adjustable (the non-adjustable variables

can be left out of consideration). Then by elementary

calculus the following linear approximation can be

obtained:

∆v1 = ∆v11 + ∆v12 This is used to determine the desired adjustments

∆v11 and ∆v12 from ∆v1, where by weight factors µ11

and µ12 the proportion can be indicated in which the

variables should contribute to the adjustment:

∆v11/∆v12 = µ11/µ12. Since then

∆v1 = +,-

+,-- ∆v12 µ11/µ12 + +,-

+,-� ∆v12 = ( +,-

+,-- µ11/µ12 + +,-

+,-� ) ∆v12

the adjustments can be made as follows:

∆v12 = ∆.1

/01/011 µ11/µ12 � /01

/014

∆v11 = ∆.1

/01/011 � /01

/014 µ12/µ11

Special cases are µ11 = µ12 = 1 (absolute equal

contribution) or µ11 = v11 and µ12 = v12 (relative

equal contribution: in proportion with their absolute

values). As an example, consider a variable that is the

weighted sum of two other variables (as is the case, for

example, for the aggregation of the effects of the

features of the objects on the attentional state): v1 =

w11v11 + w12v12. For this case, the partial

derivatives are w11 respectively w12; therefore

∆v11 = ∆,-

5-- 6 5-� µ-�/µ-- ∆v12 = ∆,-

5-- µ--/µ-� 6 5-�

When µ11 = µ12 = 1 this results in ∆v11 = ∆v12 =

∆v1/( w11+ w12 ), and when in addition w11 + w12

= 1, it holds ∆v11 = ∆v12 = ∆v1. Another setting,

which actually has been used in the model, is to take

µ11 = v11 and µ12 = v12. In this case the adjustments

are assigned proportionally; for example, when v1 has

to be adjusted by 5%, also the other two variables on

which it depends need to contribute an adjustment of

5%. Thus the relative adjustment remains the same

through propagations:

∆,-- ,-- =

∆,- 5-- 6 5-� ,-�/,-- / v11 =

∆,- ,-

This shows the general approach on how desired

adjustments can be propagated in a backward manner.

For the case study undertaken, a desired adjustment of

the attentional state is related to adjustments in the

features of the displayed objects. An example of a

(temporal logical) reasoning rule specified to achieve

this propagation process is:

desire(adjust_by(u1, a)) ∧ belief(depends_on(u1, u2)) →→ desire(adjust_by(u2, a))

Here the adjustments are taken relative, so, this rule

is based on ∆u2 / u2 = ∆u1 / u1 as derived above.

When at the end the leaves are reached, (represented

by the belief that they are directly adjustable), an

intention to adjust them is derived.

desire(adjust_by(u, a)) ∧ belief(directly_adjustable(u)) →→ intention(adjust_by(u, a))

If an intention to adjust a variable u by a exists with

current value b, the new value b+ α*a*b to be assigned

to u is determined; here α is a parameter that allows

tuning the speed of adjustment:

intention(adjust_by(u, a)) ∧ belief(has_value_for(u, b)) →→

performed(assign_new_value_for(u, b+ α*a*b))

This rule is applied for variables that describe

features f of objects at spaces s (instances for u of the

form feature(s, f)). Note that the adjustment is

propagated as a value relative to the overall value.

3. Case Study

To test the approach in a real-world situation, a case

study with human subjects while executing the Tactical

Picture Compilation Task has been undertaken. In

Section 3.1 the environment is explained, Section 3.2

discusses some implementation details of the agent

tailored to the environment, and in Section 3.3 the

results are described based on different examples.

Figure 3. Interface of the task environment

3.1. Environment

The task used for this case study is an altered

version of the identification task described in [11] that

has to be executed in order to build up a tactical picture

of the situation, i.e. the Tactical Picture Compilation

Task (TPCT). The implementation of the software was

done in Gamemaker [19]. In Figure 3 a snapshot of the

interface of the task environment is shown. The goal is

to identify the five most threatening contacts (ships). In

order to do this, participants monitor a radar display of

contacts in the surrounding areas. To determine if a

contact is a possible threat, different criteria have to be

used. These criteria are the identification criteria

(idcrits) that are also used in naval warfare, but are

simplified in order to let naive participants learn them

more easily. These simplified criteria are the speed

(depicted by the length of the tail of a contact),

direction (pointer in front of a contact), distance of a

contact to the own ship (circular object), and whether

the contact is in a sea lane or not (in or out the large

open cross). Contacts can be identified as either a

threat (diamond) or no threat (square).

3.2. Implementation

The agent was further developed and evaluated

using Matlab. The output of the environment described

in Section 3.1 was used and consisted of a

representation of all properties of the contacts visible

on the screen, i.e. speed, direction, if it is in a sea lane

or not, distance to the own ship, location on the screen

and contact number. In addition, data from a Tobii x50

eye-tracker [18] were retrieved from a participant

executing the TPC task. All data were retrieved several

times per second and were used as input for the models

within the agent. Once the model for adjustment of

attention (model 4) was tailored to the TPC case study,

the eventual implementation of it was done in C#. The

output of the implementation causes the saliency of the

different objects on the screen to either increase or

decrease, which may result in a shift of the

participant’s visual attention. As a result, the

participant’s attention is continuously manipulated in

such a way that it is expected that he pays attention to

the objects that are considered relevant by the agent.

The increase or decrease of the saliency of objects can

be done on a continuous or discrete scale, with a binary

scale as being discrete. An illustration of the output of

this implementation with attention manipulation on a

continuous scale is described below. The results of the

implementation with attention manipulation on a

binary scale have been omitted for reasons of space,

but are investigated nonetheless in Sections 5 and 6.

3.3. Results

The first results of the agent implemented for this

case study are best described by a number of example

snapshots of the outcomes of the models used in the

agent to estimate (model 1) and manipulate (model 4)

attention in three different situations over time (see

Figure 4).

On the left side of Figure 4 the darker dots

correspond to the agent’s estimation of those contacts

to which the participant is paying attention. On the

right side of the figure, the darker dots correspond to

those contacts where attention manipulation is initiated

by the system (in this case, by increasing its saliency).

On both sides of the figure a cross corresponds to the

own ship, a star corresponds to the eye point of gaze,

and the x- and y-axes represent the coordinates on the

interface of the TPCT. In the pictures to the left, the z-

axis represents the estimated amount of attention. The

darker dots on the left side are a result of the

exceedance of this estimation of a certain threshold (in

this case .03). Thus, a peak indicates that it is estimated

that the participant has attention for that location.

Furthermore, from top to bottom, the following three

situations are displayed in Figure 4:

1. After 37 seconds since the beginning of the

experiment, the participant is not paying attention

to region A at coordinates (7.5,1.5) and no

attention manipulation for region A is initiated by

the system.

2. After 39 seconds, the participant is not paying

attention to region A, while the attention should be

allocated to region A, and therefore attention

manipulation for region A is initiated by the

system.

3. After 43 seconds, the participant is paying

attention to region A, while no attention

manipulation for region A is done by the system,

because this is not needed anymore.

Figure 4. Estimation of the participant’s attention division and the agent’s reaction

The output of the attention manipulation system and

the resulting reaction in terms of the allocation of the

participant’s attention in the above three situations,

show what one would expect of an accurate system of

attention manipulation. As shown in the two pictures at

the bottom of Figure 4, in this case the agent indeed

succeeds in attracting the attention of the participant:

both the gaze (the star in the bottom right picture) and

the estimated attention (the peak in the bottom left

picture) shift towards the location that has been

manipulated.

4. Validation

In order to validate the agent’s manipulation model,

the results from the case study have been used and

tested against results that were obtained in a similar

setting without manipulation of attention. The basic

idea was to show that the agent’s manipulation of

attention indeed results in a significant improvement of

human performance. Human performance in selecting

the five most threatening contacts was compared

during two periods of 10 minutes (with and without

manipulation, respectively). The type of manipulation

was based on determining the saliency of the objects

on a binary scale. In this way it was easy (opposed to a

continuous scale) to follow the agent’s advice. The

performance measure took the severity of an error into

account. Taking the severity into account is important,

because for instance selecting the least threatening

contact as a threat is a more severe error than selecting

the sixth most threatening contact. This was done by

the use of the following penalty function (P):

78 � 98∑ 9:�;:

� !<��8 = > � ?2 �

∑ 9:�;:

where px is the prenormalised penalty of contact x and

tx is the threat value of contact x (there are 24

contacts). Human performance is then calculated by

adding all penalties of the contacts that are incorrectly

selected as one of the top five threats and subtracting

them from 1.

After the above alterations, the average human

performance over all time points of the condition

“support” was compared with the average human

performance of the first condition “no support”, where

“support” (M+ = .8714, SD+ = .0569) was found

significantly higher than “no support” (M− = .8541,

SD− = .0667), with t(df = 5632) = 10.46, p = 0. Hence

significant improvements were found comparing the

first and the second condition. Finally, subjective data

based on a questionnaire pointed out that the

participant preferred the “support” condition above that

of the “no support” condition.

5. Verification

In addition to this validation, the results of the

experiment have been analysed in more detail by

converting them into formally specified traces (i.e.,

sequences of events over time), and checking relevant

properties, expressed as temporal logical expressions,

against these traces. To this end, a number of

properties were logically formalised in the language

TTL [3]. This predicate logical language supports

formal specification and analysis of dynamic

properties. TTL is built on atoms referring to states of

the world, time points and traces, i.e. trajectories of

states over time. In addition, dynamic properties are

temporal statements that can be formulated with

respect to traces based on the state ontology Ont in the

following manner. Given a trace γ over state ontology

Ont, the state in γ at time point t is denoted by state(γ, t). These states can be related to state properties via the

formally defined satisfaction relation denoted by the

infix predicate |=, comparable to the Holds-predicate in

the Situation Calculus: state(γ, t) |= p denotes that state

property p holds in trace γ at time t.

Based on these statements, dynamic properties can

be formulated in a formal manner in a sorted first-order

predicate logic, using quantifiers over time and traces

and the usual first-order logical connectives such as ¬,

∧, ∨, ⇒, ∀, ∃. To give a simple example, the property

‘there is a time point t in trace 1 at which the estimated

attention level of space {1,2} is 0.5’ is formalised as

follows (see [3] for more details):

∃t:TIME state(trace1,t) |= belief(has_value(av(1,2), 0.5))

Below, a number of such dynamic properties that

are relevant to check the agent’s attention manipulation

are formalised in TTL. To this end, some abbreviations

are defined:

discrepancy_at(γ:TRACE, t:TIME, x,y:COORDINATE) ≡

∃a,h:REAL estimated_attention_at(γ,t,x,y,a) &

state(γ,t) |= desire(has_value(av(x,y), h)) & a<h

This predicate states that at time point t in trace γ, there is a discrepancy at space {x,y}. This is the case

when the estimated attention at this space is smaller

than the desired attention. Next, abbreviation

estimated_attention_at is defined:

estimated_attention_at(γ:TRACE,t:TIME,

x,y:COORDINATE, a:REAL)

≡ state(γ,t) |= belief(has_value(av(x,y),a))

This takes the estimated attention as calculated by

the agent at runtime. This means that this definition

can only be used under the assumption that this

calculation is correct. Since this is not necessarily the

case, a second option is to calculate the estimated

attention during the checking process, based on more

objective data such as the gaze data and the features of

the contacts.

Based on these abbreviations, several relevant

properties may be defined. An example of a relevant

property is the following (note that this property

assumes a given trace γ, a given time point t, and a

given space {x,y}):

PP1 (Discrepancy leads to Efficient Gaze Movement) If there is a discrepancy at {x,y} and the gaze is currently at {x2,y2},

then within δ time points the gaze will have moved to another space

{x3,y3} that is closer to {x,y} (according to the Euclidean distance).

PP1(γ:TRACE, t:TIME, x,y:COORDINATE) ≡

∀x2,y2:COORDINATE

discrepancy_at(γ,t,x,y) &

state(γ,t) |= gaze_at(x2,y2) & t < LT-δ

⇒ ∃t2:TIME ∃x3,y3:COORDINATE [ t<t2<t+δ &

state(γ,t2) |= gaze_at(x3,y3) &

√((x-x2)2+(y-y2)

2) > √((x-x3)

2+(y-y3)

2)]

In the above property, a reasonable value should be

chosen for the delay parameter δ. Ideally, δ equals the

sum of 1) the time it takes the agent to adapt the fea-

tures of the contacts and 2) the person’s reaction time.

To enable automated checks, a special software

environment for TTL exists, featuring both a Property

Editor for building and editing TTL properties and a

Checking Tool that enables formal verification of such

properties against traces [3]. Using this TTL Checking

Tool, properties can be automatically checked against

traces generated from any case study. In this paper the

properties were checked against the traces from the

experiment described in Section 5. When checking

such properties, it is useful to know not only if a

certain property holds for a specific space at a specific

time point in a specific trace, but also how often it

holds. This will provide a measure of the

successfulness of the system. To check such more

statistical properties, TTL offers the possibility to test a

property for all time points, and sum the cases that it

holds. Via this approach, PP1 was checked against the

traces of the experiment with δ = 3.0 sec. These checks

pointed out that (under the “support” condition) in

88.4% of the cases that there was a discrepancy, the

gaze of the person changed towards the location of the

discrepancy. Under the “no support” condition, this

was around 80%.

6. Formal Analysis

The results of validation and verification discussed

above may ask for a more detailed analysis. In

particular, the question may arise of how a difference

between 80% without support and 88% with support as

reported above should be interpreted. Here a more

detailed formal analysis is given that supports the

context for interpretation of such percentages. To this

end the effect of arbitrary transitions in gaze dynamics

is analysed, in particular those that occur between the

time points of monitoring the gaze and adjustment of

luminance.

At a given time point, the adjustment of luminance

is based on the gaze at that point in time. A question is

whether at the time the luminance is actually adjusted,

the gaze is still at the same point. When the system is

very fast in adjusting the luminance this may be the

case. However, it is also possible that even in this very

short time the gaze has changed to focus on another

location on the screen. Here it is analysed in how many

cases of an arbitrarily changed gaze still the luminance

adjustment by the system should be sufficient. The

general idea is that this is the case as long as the gaze

transition does not increase the distance between gaze

location and considered discrepancy location. The area

of all locations of the screen for which this is the case

is calculated mathematically below; here the worst case

is analysed, the case when the considered discrepancy

location is at the corner of the screen. The screen is

taken as a square. The function f indicates an under-

approximation of the number (measured by the area) of

locations with distance at most r of O. For r ≤ d the

area within distance r of O is a quarter of a circle: π/4

r2; so f(r) = π/4 r2 for r≤ d. For r>d an approximation

was made. The part of distance larger to O than r is

approximated by two triangles as ∆PQR in Figure 5.

ON = d QN = √(r2 – d2) QR = RN - QN = d - √(r2 – d2) OR = √2*d PR = OR – OP = √2*d – r ∆PQR = ½ PR* h with h the distance of Q to OR.

h = ½√2*QR = ½ √2 * (d - √(r2 – d2))

Figure 5. Gaze area approximation

The whole area –2∆PQR is

d2 – ½ √2 * (d - √(r2 – d2))(√2*d – r)

Therefore it is taken

f(r) = d2 – ½ √2 * (d - √(r2 – d2))(√2*d – r) for r>d

For d = 10 the overall function f divided by the

overall area d2 (thus normalising it between 0 and 1) is

shown in Figure 6. For example, it shows that when r =

½d, then the covered area is around 20% of the overall

screen, but when r is a bit larger, for example r = d,

then at least around 80% is covered. Note that this is a

worst case analysis with the location considered in the

corner. In less extreme cases the situation can differ.

When, for example, the considered location is at the

center, then for distance r = ½ d, the covered area

would be a full circle with radius ½d, so an area of π/4

d2, which is more than 70% of the overall area.

Figure 6. Function of the number of locations within distance r of O, divided by d

2, for d=10

Moreover, the distance of the considered location

where a discrepancy is detected to the actual gaze may

not have a uniform probability distribution from 0 to

√2*d. Indeed, the value 0 may be very improbable, and

the larger values may have much higher probabilities.

Suppose p(r) denotes the probability (density) that the

distance between actual gaze and considered

discrepancy location is r, then the expected coverage

can be calculated by:

@ 9�� A B��√�ACD

For example, if a probability distribution is assumed

that is increasing in a modest, linear way from p(0) = 0

to p(√2*d) = 1/d2, then for d = 10 with p(r) = r/100

this becomes approximately (estimated by numerical

integration):

@ EAF�E�-DD ��-;

D = 0.72

This means that the expected coverage would be

72%. For a bit less modest increase, for example in a

quadratic manner for d = 10 from p(0) = 0 to p(14) =

0.2, then the expected coverage is approximately 80%

(estimated by numerical integration):

@ E�AF�E�-DDD ��-;

D = 0.80

When it turns out that the gaze is often changing,

then a remedy is to base the adjustment of the

luminance on a larger distance for r, thus anticipating

on the possible future states. The graph for f shows that

if r is taken equal to distance d, then a coverage of 80%

is achieved.

7. Discussion

An important task in the domain of naval warfare is

the Tactical Picture Compilation Task, where persons

have to deal with a lot of complex and dynamic

information at the same time. To obtain an optimal

performance, an intelligent agent can provide aid in

such a task. This paper presented an initial version of

0

0.2

0.4

0.6

0.8

1

1.2

1 6 11

P

Q

R

O N d

h

such a supporting software agent. Within this type of

agent an explicitly represented model of human

functioning plays an important role, for the case

considered here the model of the human’s attention.

To obtain a software agent for these purposes, four

models have been designed that are aimed at

manipulating a person attention at a specific location:

(1) a dynamical system model for attention, (2) a

reasoning model to generate beliefs about attentional

states using the attention model for forward simulation,

(3) a discrepancy assessment model, and (4) a decision

reasoning model, again using the attention model, this

time for backward desire propagation. The first two

models were adopted from earlier work [5].

The ambient agent has been implemented in a case

study where participants perform a simplified version

of the Tactical Picture Compilation Task. Within this

case study an experiment was conducted to validate the

agent’s manipulation. The participants, both in the

experiment discussed in this paper as well in earlier

pilot studies, reported to be confident that the agent’s

manipulation indeed is helpful. The results of the

validation study with respect to performance

improvement have also been positive.

Further investigation has to be done in order to rule

out any order effects, which suggests more research

with more participants. It is also expected that future

improvements of the agent’s submodels, based on the

gained knowledge from automated verification will

also contribute to the improved success of such

validation experiments.

Finally, a detailed analysis and verification of the

behaviour of the agent also provided positive results.

Traces of the experiment were checked to see whether

the agent was able to adapt the features of objects in

such a way that they attracted human attention. Results

show that when there was a discrepancy between the

prescriptive and the descriptive model of attention, the

agent indeed was able to attract the human’s attention.

References

[1] Baron-Cohen, S. Mindblindness: an essay on autism and theory of mind. MIT Press, 1995.

[2] Bosse, T., Both, F., Gerritsen, C., Hoogendoorn, M., and

Treur, J. Model-Based Reasoning Methods within an

Ambient Intelligent Agent Model. In: M. Mühlhäuser et al.

(eds.), Constructing Ambient Intelligence: AmI-07

Workshops Proceedings. LNCS, vol. 11, Springer Verlag, 2008, pp. 352-370.

[3] Bosse, T., Jonker, C.M., Meij, L. van der, Sharpanskykh,

A., and Treur, J., Specification and Verification of Dynamics

in Agent Models. International Journal of Cooperative

Information Systems, vol. 18, 2009, pp. 167 - 193. Shorter

version in: Nishida, T. et al. (eds.), Proc. of the Sixth Int.

Conference on Intelligent Agent Technology, IAT'06. IEEE Computer Society Press, 2006, pp. 247-254.

[4] Bosse, T., Memon, Z.A., and Treur, J., A Two-Level

BDI-Agent Model for Theory of Mind and its Use in Social

Manipulation. In: Proceedings of the AISB 2007 Workshop on Mindful Environments, 2007, pp 335-342.

[5] Bosse, T., Maanen, P.-P. van, Treur, J., Simulation and

Formal Analysis of Visual Attention, Web Intelligence and Agent Systems Journal, vol. 7, 2009, pp. 89-105.

[6] Heuvelink, A., Both, F.: Boa: A cognitive tactical picture

compilation agent. In: Proc. of the Int. Conf. on Intelligent

Agent Technology, IAT ’07, IEEE Comp. Soc. Press, 2007.

[7] Itti, L. and Koch, C., Computational Modeling of Visual

Attention, Nature Reviews Neuroscience, Vol. 2, No. 3, 2001, pp. 194-203.

[8] Levitt, J.B., and Lund, J.S. Contrast dependence of

contextual effects in primate visual cortex. Nature, 387, 1997, 73-76.

[9] Marsella, S.C., Pynadath, D.V., and Read, S.J.,

PsychSim: Agent-based modeling of social interaction and

influence. In: Lovett, M., Schunn, C.D., Lebiere, C., and

Munro, P. (eds.), Proc. of the Int. C. on Cognitive Modeling, ICCM 2004, pp. 243-248 Pittsburg, Pensylvania, USA.

[10] Memon, Z.A., and Treur, J., Cognitive and Biological

Agent Models for Emotion Reading. In: Jain, L., Gini, M.,

Faltings, B.B., Terano, T., Zhang, C., Cercone, N., Cao, L.

(eds.), Proceedings of the 8th IEEE/WIC/ACM International

Conference on Intelligent Agent Technology, IAT'08. IEEE Computer Society Press, 2008, pp. 308-313.

[11] Nothdurft, H. Salience from feature contrast: additivity

across dimensions. Vision Research 40, 2000, 1183-1201.

[12] Parkurst, D., Law, K., and Niebur, E. Modeling the role

of salience in the allocation of overt visual attention. Vision Research, 42, 2002, 107-123.

[13] Posner, M. E., Orienting of attention, Q. J. Exp. Psychol., 32, 1980, pp. 3-25.

[14] Theeuwes, J., Endogenous and exogenous control of visual selection, Perception, 23, 1994, pp. 429-440.

[15] Theeuwes, J. Abrupt luminance change pops out; abrupt

color change does not, Perception & Psychophysics, 57(5),

1995, 637-644.

[16] Treisman, A. Features and objects: The 14th Bartlett

memorial lecture. Q.J. Experimental Psychology A. 40, 201-237.

[17] Turatto, M., and Galfano, G., Color, form and luminance

capture attention in visual search. Vision Research, 40, 2000, 1639-1643.

[18] http://www.tobii.com.

[19] http://www.yoyogames.com/gamemaker.

Attention Manipulation for Naval Tactical Picture Compilation

Documents

Transcript of Attention Manipulation for Naval Tactical Picture Compilation