Cognitive task load in a naval ship control centre: from identification to prediction


M. GROOTJEN*†‡, M. A. NEERINCX§‡ and J. A. VELTMAN§

†Defence Materiel Organization, Directorate Materiel Royal Netherlands Navy, Department of Naval Architecture and Marine Engineering, PO Box 20702, 2500 ES The Hague, The Netherlands

‡Technical University of Delft, PO Box 5031, 2628 CD Delft, The Netherlands

§TNO Human Factors, Kampweg 5, PO Box 23, 3769 ZG Soesterberg, The Netherlands

Deployment of information and communication technology will lead to

further automation of control centre tasks and an increasing amount of

information to be processed. A method for establishing adequate levels of

cognitive task load for the operators in such complex environments has been

developed. It is based on a model distinguishing three load factors: time

occupied, task-set switching, and level of information processing. Application

of the method resulted in eight scenarios for eight extremes of task load

(i.e. low and high values for each load factor). These scenarios were

performed by 13 teams in a high-fidelity control centre simulator of the

Royal Netherlands Navy. The results show that the method provides good

prediction of the task load that will actually appear in the simulator. The

model allowed identification of under- and overload situations showing

negative effects on operator performance corresponding to controlled

experiments in a less realistic task environment. Tools proposed to keep

the operator at an optimum task load are (adaptive) task allocation and

interface support.

Keywords: Mental load; Task analysis; Human–computer interaction;

Cognitive engineering; Task allocation; Ship control centre

1. Introduction

Because of ongoing automation in process control, fewer personnel have to manage

high-demand situations and supervise complex systems. Reduced manning concepts

appear based on the notion that the information and communication technology can

take over and support operator tasks. However, information processing demands

appear to increase substantially for the operators because of the availability of

*Corresponding author. Email: [email protected]

Ergonomics, Vol. 49, Nos. 12–13, 10–22 October 2006, 1238–1264

Ergonomics ISSN 0014-0139 print/ISSN 1366-5847 online © 2006 Taylor & Francis

http://www.tandf.co.uk/journals DOI: 10.1080/00140130600612705

ever-increasing amounts of information that have to be processed, the increased scope

of actions, and the ever-increasing costs of errors in an environment with possibly

ambiguous and insecure information (cf. Neerincx and Griffioen 1996). The central

question is how to address human factors systematically in the development and maintenance

processes of such complex and dynamic human–machine systems in order to

realize optimal operational effectiveness and efficiency.

An extensive and diverse set of human factors methods and tools has been identified

and proposed for the design of tasks and user interfaces, for example from the perspective

of (cognitive) task analysis (Kirwan and Ainsworth 1992, Schraagen et al. 2000,

Hollnagel 2003), human–computer interactions (Helander et al. 1997, Jacko and

Sears 2003), and usability engineering (Mayhew 1999, Maguire 2001, Rosson and

Carroll 2001). We propose a cognitive engineering approach, in which the human factors

engineering activities are tailored to the domain specifics and continuously improved by

empirical studies. Two examples are of interest in this regard. Koubek et al. (2003)

address human factors within a framework and theoretically based software tool

which provide engineers and designers with easy access to the most recent advances in

human–machine interface design. Another good example is presented by Neerincx et al.

(2003c) who developed the cognitive and functional framework (COLFUN) for

envisioning and assessing high-demand situations in order to realize adequate human

resource deployment. Application of the framework (Rypkema et al. 2002) showed how

COLFUN supports the integration of human factors in the iterative development process

of complex human–machine design for a traffic control centre.

In the current research we show how to address human factors in the design process

by the application and validation of a cognitive task load (CTL) method in a complex

domain. This human-centred design method (Neerincx 2003) is based on a CTL model

and aims at an optimal CTL for the operators at all times. The method was evaluated in

a complex partially automated task environment in process control—a high-fidelity

ship control centre (SCC) simulator in which platform systems are supervised and

damage control activities are planned and coordinated.

1.1. Cognitive task load model

The CTL model (figure 1) distinguishes three load factors that have a substantial effect on

task performance and mental effort (Neerincx 2003). The first classical load factor,

percentage time occupied (TO), has been used to assess workload in practice for timeline

assessments. Such assessments are often based on the notion that people should not be

occupied for more than 70–80% of the total time available (Beevis 1992). The second load

factor is the level of information processing (LIP). To address cognitive task demands, the

cognitive load model incorporates the skill–rule–knowledge framework of Rasmussen

(1986). In this framework, LIP is divided into three levels: skill-based, rule-based,

and knowledge-based. At the skill-based level, information is processed automatically,

resulting in actions that impose little cognitive demand. At the rule-based level,

input information triggers routine solutions (i.e. procedures with rules of the type

‘if <event/state> then <action>’), resulting in efficient problem-solving in terms of the

required cognitive capacities. At the knowledge-based level, the problem is analysed

and solution(s) are planned, in particular to deal with new situations. This type of

information processing can involve a high load on the limited capacity of working

memory. To address the demands of attention shifts, the cognitive load model

distinguishes task-set switching (TSS) as a third load factor. Complex task situations

consist of several different tasks with different goals. These tasks appeal to different

sources of human knowledge and capacities, and refer to different objects in

the environment. We use the term ‘task-set’ to denote the human resources and

environmental objects with the momentary states which are involved in the task

performance.

Figure 1 depicts a three-dimensional (3D) ‘load space’ in which human activities can be

projected with regions indicating the cognitive demands that the activity imposes on the

operator. It should be noted that these factors represent task demands which affect

human operator performance and effort (i.e. it is not a definition of the operator cognitive

state). In the middle area, CTL matches the operator’s mental capacity. At corner 8 CTL

is high (TO, TSS, and LIP are high) and an overload occurs. Corner 1 represents the area

in which, because of underload, CTL is not optimal. When TO is high, and LIP and TSS

are low, vigilance problems can appear (corner 2) (Levine et al. 1973, Parasuraman 1986).

When TO and TSS are high, cognitive lock-up can appear, i.e. the tendency of people to

focus on single faults, ignoring the other subsystems to be controlled (line 4–8) (Boehne

and Paese 2000, Kerstholt and Passenier 2000).
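The critical regions described above can be illustrated with a small sketch. It is not part of the CTL method itself: it assumes that each load factor is reduced to a binary low/high level (the model's load space is continuous), and it takes the corner numbering from the conditions listed in table 1. The function names are hypothetical. Corner 8 also lies on the line 4–8; the sketch labels it as overload, because that is the region the text assigns to that corner.

```python
from typing import Optional


def corner_number(to_high: bool, tss_high: bool, lip_high: bool) -> int:
    """Map low/high levels of TO, TSS and LIP to a corner (1-8) of the load space.

    The numbering follows table 1: TO toggles the corner by 1, TSS by 2, LIP by 4.
    """
    return 1 + int(to_high) + 2 * int(tss_high) + 4 * int(lip_high)


def critical_region(to_high: bool, tss_high: bool, lip_high: bool) -> Optional[str]:
    """Return the critical region named in the text for a corner, or None."""
    corner = corner_number(to_high, tss_high, lip_high)
    if corner == 1:
        return "underload"          # all three load factors low
    if corner == 2:
        return "vigilance"          # TO high, TSS and LIP low
    if corner == 8:
        return "overload"           # all three load factors high
    if to_high and tss_high:
        return "cognitive lock-up"  # remaining corner on the line 4-8
    return None                     # no region is named for corners 3, 5, 6 and 7
```

For example, `critical_region(True, True, False)` corresponds to corner 4 and yields the cognitive lock-up label.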

Higher CTL does not automatically result in a reduced level of performance.

The effects of increased CTL can be counteracted by an increase in effort. Thus the

performance effects must be related to the effort scores (Zijlstra 1993, Veltman and

Gaillard 1996). In evaluation of performance both speed and accuracy have to be

considered. The speed–accuracy trade-off, and the inverse relation between these two

parameters, is described by Wickens (1992).

1.2. CTL method

Neerincx (2003) developed the CTL method to guide human–machine development

processes in order to realize acceptable levels of task load for process control operators.

The core of the method is the CTL model described above.

Figure 1. Dimensions of the CTL model showing time occupied (TO), task-set switches (TSS), and level of information processing (LIP). Neerincx (2003) distinguishes several critical regions: underload, vigilance, cognitive lock-up, and overload. The numbers (1–8) represent the scenarios used in the experiment.

CTL can only be analysed for specific concrete task contexts. An effective method of

creating such a context is the use of scenarios (Carroll 2000). Scenarios presuppose

a certain setting, within which the roles are played by actors. In complex scenarios

different actors can be involved, possibly interacting with each other. Actors have specific

goals or tasks, and actions have to be taken to achieve these goals. Neerincx (2003)

provides a CTL method and a description format for the systematic creation and

assessment of normal and critical situations with their corresponding action sequences.

Such an action sequence displays the actions of different actors, including their

interactions with support systems, on a timeline. The actions can be triggered by events,

and are grouped according to their higher-level task (goal). Figure 2 shows a (simplified)

part of an action sequence diagram. The actors are displayed on the horizontal axis

(system, operator 1, operator 2, etc.). The timeline is represented by the vertical axis.

The different task-sets can be distinguished by colours or by different types of line.

The LIP levels are presented in various shades of grey.
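One possible way to hold the information in an action sequence is sketched below. This is only an illustration under stated assumptions: the `Action` record, its field names, the use of seconds as the time unit, and the example actions are all hypothetical and are not taken from the description format of Neerincx (2003).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    actor: str     # e.g. "system", "operator 1", "operator 2"
    task_set: str  # higher-level task (goal) to which the action belongs
    start: float   # assumed unit: seconds from the start of the scenario
    end: float
    lip: str       # "skill", "rule" or "knowledge"


def timeline_by_actor(actions):
    """Group actions per actor and order each group along the timeline."""
    grouped = defaultdict(list)
    for action in actions:
        grouped[action.actor].append(action)
    return {actor: sorted(acts, key=lambda a: a.start)
            for actor, acts in grouped.items()}


# Hypothetical example data, invented for illustration only.
example = [
    Action("operator 1", "handle alarm", 20.0, 35.0, "rule"),
    Action("system", "handle alarm", 0.0, 1.0, "skill"),
    Action("operator 1", "monitor", 5.0, 15.0, "skill"),
]
columns = timeline_by_actor(example)
```

Each key of `columns` then corresponds to one actor column of the diagram, with that actor's actions in time order.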

1.3. Application and validation

The CTL model and method can predict whether future task demands are attuned to

the limited human information processing capacities. This model and method are

derived from cognitive research in different task domains. Our approach is to conduct

experiments in both controlled laboratory settings and in more complex realistic settings

to systematically test the theoretical foundation and investigate its application in the real

world. In this approach, the test environment increases in complexity and therefore

decreases in controllability. In this way we can both test and refine the theory and achieve

a good understanding of its applicability in practice.

Figure 2. Small portion of an action sequence diagram using the CTL method (Neerincx 2003).

The model was validated in two experiments. The first experiment providing empirical

support for the CTL model was a simple laboratory task called the ‘alarm 112 task’

(Neerincx et al. 2003a). The participant was in charge of three emergency services

(fire brigade, ambulance, and police) and had to deal with different types of crises. Each

service consisted of a team of four ‘virtual’ persons. The experiment had nine different

conditions, giving a combination of low, medium, and high scores for the load factors LIP

and TSS. The second experiment used a test environment with computer tasks explicitly

exhibiting important features from damage control on ships, called the ‘SCC computer

task’ (Neerincx and van Besouw 2001). In this task the user plays the role of damage

control manager, supervising platform systems on the ship. Eight scenario types were

designed for the SCC task, one for each corner of the model. Both laboratory

experiments showed that LIP and TSS can affect operator performance and mental effort

substantially in addition to TO. Furthermore, in the experiments the negative effects of

the load factors reinforced each other.

The CTL method has been used for a variety of purposes. Neerincx and Passenier

(2000) found that the method was useful for task allocation in the design process of an air

defence and command frigate (ADCF) for the Royal Netherlands Navy (RNlN).

Grootjen et al. (2002) used the method to develop user interface support which proved to

have a substantial added value for task performance. Neerincx et al. (2003b) presented a

scenario-based tool which is able to calculate load distributions (including possible

occurrences of momentary peak values) and overall execution time for a particular crew.

This tool is based on the CTL method and can be of great value in the design of control

centres. For example, van Veenendaal (2002) assessed the action sequences for alternative

designs of the bridge of a naval ship, comprising different task allocations and support

functions for navigation and platform supervision. The analysis showed that, under

normal conditions, the task of the bridge officer could be extended to include platform

control tasks.

1.4. Current research: high-fidelity SCC experiment

The RNlN is maintaining and developing various classes of frigates, ranging from

standard frigates, succeeded by the multipurpose frigate (M-frigate), to the new ADCF.

These frigates have an SCC in which CTL can vary enormously from one extreme to the

other and therefore will be an important factor in the effectiveness of the human

problem-solving process. SCC occupation depends on the readiness state, and consists of two to

six people. The readiness state is determined by the ship’s commander and depends on the

situation (i.e. narrow channel, hostile threats). The technical school of the RNlN has a

high-fidelity SCC simulator. This simulator makes it possible to perform experiments in

complex realistic settings and to systematically test the theoretical foundation of the

model and method and investigate applications in the real world. We conducted

experiments to test the effects of the task characteristics distinguished by the model on

SCC task performance and subjective mental effort (SME) during different readiness

states. The scenarios were designed for the extremes of each of the three load factors.

This resulted in eight scenarios (figure 1).

This experiment should improve the empirical foundations of the CTL model and

method, and provide an initial estimation of the critical load values for the SCC.

We made two assumptions.

1. Application of the CTL method results in CTL specifications per crew member

(figure 2) which predict the actual CTL of a crew member adequately, i.e. we expect

the method to provide good predictions of the task load that will actually appear in

the SCC simulator.

2. The three load factors of the CTL model affect task performance and SME

substantially, and can be used to identify underload, overload, vigilance, and cognitive

lock-up. Corresponding to the laboratory experiments, we expect an increased SME

and/or reduced performance when LIP, TSS, and TO are high.

Section 2 describes the method used in the experiment. Section 3 summarizes the results

of the experiment. Sections 4 and 5 contain the discussion, conclusions, and a description

of future work plans.

2. Method

2.1. Development of scenarios

Most of the development of the scenarios was done in cooperation with the technical

school of the RNlN. First, a task analysis was performed following the method of

Neerincx (2003) (figure 2 shows a small portion of this task analysis). Then, the scenarios

were implemented in the high-fidelity SCC simulator. A pilot experiment was performed

to test the scenarios in the training system. In this way the trainers became familiar with

the scenarios and some final adjustments could be made. Figure 3 shows the simulator

from the side where the trainers control the experiment (known as the ‘kitchen’).

The SCC simulator, which is identical to the SCC on an M-frigate, can be seen

behind the one-way mirror. At least three trainers were needed for each scenario.

Two performed the scenario (activating alarms, operating the system, and communicating

with the participants), and the third maintained a complete overview and made sure

that the scenario went in the predefined direction. In conditions 7 and 8, a fourth trainer

was needed for communication with the participants. In addition to the trainers, one

specialist made performance notes which were used in the evaluation. Figure 3 shows

three of the four trainers controlling scenarios 7 and 8.

Figure 3. Three of the four trainers controlling scenarios 7 and 8. From left to right, trainer 1 maintains an overview, trainer 2 communicates with the participants, and trainer 3 activates alarms, operates the system, and communicates with the participants.

First, all conditions were performed by expert teams to obtain the expert performance

time. To do this, each scenario was executed by an expert operator and manager, who

had experience with the scenario and the standard procedures. The experts were those

people who had participated in the pilot.

2.2. Participants

Thirteen teams participated in the experiment. Three teams comprised experts from the

technical school, and the other 10 were active teams from the crews of the M-frigates that

were in harbour at the time of the experiment. Before and during their period on board

they receive training courses to develop and maintain skills. Each team consisted of an

operator and a manager. However, some teams had more members to make the scenarios

more realistic (e.g. more realistic for the specific operator and manager who were being

evaluated). The actions of the extra team members were not used in the evaluation of this

research. The ranks of the operators and managers are given in table 1. Their experience

varied over a wide range (approximately 2–10 years).

2.3. Task

The participants had to deal correctly with the emergencies that appeared. In all

tasks, the system is operated by the operator and the manager makes the decisions.

The manager is also responsible for all actions performed. At the time of the experiments

all team members were actively working on ships, and so they should have been

familiar with the operational procedures and working practices. As stated in section 1.4,

eight scenarios were designed for the extremes of each of the three load factors

(1–8 in figure 1). These scenarios are the eight conditions of the experiment. Table 1

shows the conditions and the scenario types, and a short summary of each condition is

given below.

Condition 1: Machinery Breakdown Drill 1. The ship is in transit when a malfunction in

the automation of the pitch controller appears. The crew should execute the ‘automation

failure’ procedure. The chief of the watch asks the bridge to make no further changes to

the ship’s speed until the problem has been identified. When this problem is solved,

another alarm appears: low pressure in the seawater main system. The correct procedure

is to open the emergency cooling valve.

Condition 2: Machinery Breakdown Drill 2. The ship is in harbour and making

preparations to leave. The bridge asks for two cruising diesels. For this a standard

procedure has to be followed: starting, checking, and finally selecting the machines. After

this, the bridge asks for the two main gas turbines, and a similar procedure has to be

followed. When sailing at high speed on the gas turbines, an alarm indicates high

temperature in the gear box. The correct procedure is to reduce power, and if the

temperature does not decrease, to perform an emergency stop.

Table 1. Conditions of the experiment.

Condition  TO    TSS   LIP   Scenario type   Operator and manager  Ranks            No. of teams  No. of team members
(corner)
1          Low   Low   Low   MBDs            Chief of the watch    Sergeant         5             2
                                             Deputy chief          Corporal
2          High  Low   Low   MBDs            Chief of the watch    Sergeant         5             2
                                             Deputy chief          Corporal
3          Low   High  Low   Fire at sea     DC-officer            Lieutenant       4             5
                                             NBCD operator         Sailor/Sergeant
4          High  High  Low   Fire at sea     DC-officer            Lieutenant       4             5
                                             NBCD operator         Sailor/Sergeant
5          Low   Low   High  MBDs            Chief of the watch    Sergeant         5             2
                                             Deputy chief          Corporal
6          High  Low   High  MBDs            Chief of the watch    Sergeant         5             2
                                             Deputy chief          Corporal
7          Low   High  High  Battlestations  M-officer             Lieutenant       4             5
                                             Propulsion operator   Sailor/Sergeant
8          High  High  High  Battlestations  M-officer             Lieutenant       4             5
                                             Propulsion operator   Sailor/Sergeant

MBD = Machinery breakdown drill.

Condition 3: Fire at sea 1. A fire alarm in the galley appears in the SCC. The fire is small

and is easily extinguished. Two men are injured, one of whom has an important role in

the fire-fighting organization. During the fire an important door, which should remain

closed at all times, opens.

Condition 4: Fire at sea 2. There is a fire in the front engine room. The correct procedure is

that halon should be inserted as soon as possible, and boundary cooling has a high

priority. However, some problems have to be solved first: an injured man is in the

engine room and people are trying to rescue him, and an air valve in the engine room is

stuck in open position. Finally, the halon is inserted, after which boundary cooling has

the highest priority.

Condition 5: Machinery breakdown drill 3. In this scenario the ship is in a high-speed

exercise. At the start an alarm indicates a high temperature in the power turbine. At the

time of the experiment, there was not a predefined procedure for this problem. However,

the correct procedure is to reduce speed. After this procedure, another alarm indicates

vibration in the power turbine. The standard procedure for this is to reduce speed, but

this has already been done. Therefore the procedure should be an emergency stop. The

final problem in this scenario is a combination of two alarms: low pressure in the cooling

water and a bilge water alarm. When these alarms appear simultaneously an emergency

stop on both propulsion shafts must be performed. The bridge allows this, but wants

propulsion recovered as soon as possible.

Condition 6: Machinery breakdown drill 4. In this scenario, the ship is sailing in a narrow

channel. The first alarm that appears is a high-temperature oil alarm for the port cruising

diesel. The standard procedure is a speed reduction or an emergency stop when the

speed is already low (as in this case). However, the bridge does not allow a stop and

alternative propulsion has to be offered first. After offering alternative propulsion on

one side, the bridge asks for the gas turbine on the other side to be started as well.

During high-speed sailing, an alarm appears in the hydraulic system. The correct

procedure is ‘automation failure’. After a short period a low-level alarm in the hydraulic

tank is activated, suggesting an oil leakage. An emergency stop must be performed,

but because there is no hydraulic oil left the procedure has become much more

complicated.

Conditions 7 and 8: Battlestations. These scenarios consist of eight alarms, each of which

is briefly explained below.

1. As a result of a leakage in the chilled water plant, the operator receives a

low-pressure and high-temperature alarm. Before he can determine where the leakage is,

the rear chilled water plant shuts down and he has to change to another

configuration. The leakage has to be repaired, but after the repair the water keeps

rising. Pumping the water outside with an eductor is not really an option because this

uses the pressure of the fire main system (see alarm 3 below).

2. Because of an automation failure, automatic pitch control is no longer possible.

The ‘automation failure’ procedure has to be followed. At the same time an error in

the fuel system controller (FSC) of the gas turbine arises. Now the pitch has to be

controlled by hand from the SCC, and the FSC has to be controlled by hand locally.

When alarm 5 appears, the pitch also has to be controlled locally.

3. Personnel in the front main engine room report a leakage in the fire main system.

An emergency repair is required. As long as this leakage is not repaired, some

systems (e.g. high-pressure air) are not usable.

4. A leakage in the primary steering system causes a low-pressure alarm. Extra

personnel are needed to make emergency repairs. Damage repair takes a long time.

5. Because of a software failure, the output layer of some automated systems is

disabled. This gives a remote output disable (ROD) alarm. Many components can no

longer be operated from the SCC. There will not be an audible alarm when this

problem arises, but someone has to notice a lit indicator light on the console in front of him.

When detected, the problem can be solved with a (simple) reset.

6. A leakage causes two alarms which appear almost simultaneously: a low-pressure

alarm in the cooling system (seawater) and a bilge water high alarm. Activating

emergency cooling from the SCC is not possible because of the ROD alarm

(alarm 5), and has to be done locally. After this, temperatures still increase and the

procedure for speed reduction of the port propulsion shaft should be performed.

However, the bridge does not allow this. After a while the temperature becomes so

high that an emergency stop is essential, but the bridge still rejects this. The best

solution now is to decrease propulsion as much as possible.

7. The communication system of the manager has a (software) malfunction. He should

warn a specialist to fix it and, in the meantime, find other ways to communicate.

8. As a result of problems in the gas turbine oil system, a high-temperature alarm

appears for a bearing in the starboard power turbine, followed by a vibration alarm.

A procedure to reduce speed or an emergency stop has to be performed, but the

bridge does not allow either procedure. After a brief period the machine stops

automatically. The operator should choose the trailing mode for that propulsion

shaft.

2.4. Video analysis and subjective ratings

All experimental sessions were recorded on video tape. This tape was replayed after

each session, during which the operators and managers had to indicate when they started

and stopped an action. A software tool that was originally developed for a workload

analysis of Lynx helicopter crews (Veltman and Gaillard 1999) was used for this analysis.

Apart from indicating the start and stop times, the participants had to give

a score on a mental effort and a task complexity scale. These rating scales appeared

successively on the computer screen every minute. The participants were instructed to

evaluate the previous minute for these ratings. The range of the rating scales was between

0 and 10 in steps of 0.25 points. A rating could be given by moving a pointer with the

arrow keys and pressing the enter key when the pointer indicated the proper rating.

At the first appearance of the scale, the arrow pointed to the value 5, and at all successive

times it pointed to the last rating entered. Therefore, when effort or complexity was

unchanged during the last minute, the participant could simply indicate this by pressing

the enter key.
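The behaviour of the rating scale described above can be sketched as follows. The class and method names are hypothetical, and clamping the pointer at the ends of the scale is an assumption; the text only states the range, the step size, the initial value, and that the pointer starts at the last rating entered.

```python
class RatingScale:
    """Sketch of a 0-10 rating scale with 0.25-point steps."""

    MIN, MAX, STEP, START = 0.0, 10.0, 0.25, 5.0

    def __init__(self):
        self.pointer = self.START  # first appearance: pointer at the value 5
        self.ratings = []          # one rating per minute of video replay

    def move(self, steps: int) -> None:
        """Arrow keys: move the pointer by a whole number of 0.25 steps."""
        value = self.pointer + steps * self.STEP
        # Assumption: the pointer cannot leave the 0-10 range.
        self.pointer = min(self.MAX, max(self.MIN, value))

    def enter(self) -> float:
        """Enter key: store the rating; the pointer stays at this value."""
        self.ratings.append(self.pointer)
        return self.pointer

    def mean(self) -> float:
        """Average rating over the session."""
        return sum(self.ratings) / len(self.ratings)
```

A participant whose effort did not change between two minutes would simply call `enter()` again without calling `move()`.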

It appeared that the participants were unable to indicate the beginnings and endings

of all actions, because there were too many actions to indicate. Therefore only the

ratings from the above-mentioned analysis were used. In order to establish the start

of a new task-set and the beginnings and endings of all actions properly, all video

tapes were analysed by a specialist. Each tape was replayed twice: once to score the

task-sets and actions of the operator, and once to do the same for the manager.

The advantage of this procedure was that the same criteria for all actions were used in

all sessions.

2.5. Variables

The independent variables are the three load factors of the CTL model:

• time occupied;

• number of task-set switches;

• level of information processing.

The CTL method was used to predict a ‘low’ and a ‘high’ level for each independent

variable. This leads to eight different conditions (i.e. scenarios), which can be visualized in

a cube (figure 1).

Five dependent variables were measured. The first three were used to determine the

actual values of the three load factors of the CTL model. These values were used to

validate whether manipulation of the independent variables with the CTL method

succeeded. The last two variables were used to identify critical situations (e.g. underload,

overload, vigilance, and cognitive lock-up situations).

1. Time occupied. Video analysis revealed a timeline with all actions during a scenario.

The time occupied was defined as the total time that a participant was busy with the

actions relative to the total scenario time.

2. Number of task-set switches. The number of task-set switches was identified from

the video analysis data by the experts. Every time the subjects changed task-set a

function key was pressed.

3. Complexity. The complexity of the session was rated each minute. Complexity was

used to validate the independent variable LIP. The average rating in a session was

used for further analysis.

4. Subjective mental effort. SME was rated each minute by the participants during

the video replay session. The average rating during a session was used for further

analysis.

5. Performance. Two performance measures were used.

(a) Relative action time. All conditions were performed by expert teams to obtain

the expert performance time. The members of these teams were experienced

with the type of alarms that could happen in each condition. The times that

these experts needed to perform the scenarios were used as baseline values for

the participants.

(b) Performance ratings. The performance of each participant was rated by two

specialists (the instructor and experiment leader). After all the experiments

were completed, a list of important actions was made for each scenario. The

maximum number of points that could be gained for each action was mainly

based on the ship’s readiness state and the severity of resulting damage due to

human error. Subsequently, the recorded videos were evaluated and ratings

were given by two specialists. Performance notes made by another specialist

during execution of the scenario were also used in this evaluation.

The results of the validation of the method are presented in section 3.1, and the results

of SME investigations and performance measures are given in section 3.2.
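
As a minimal sketch of how the first two measures and the relative action time could be derived from a scored timeline, the Python fragment below assumes that each scored action is a tuple of start time, end time, and task-set label. The record layout, the function names, and the example numbers are illustrative assumptions and do not come from the experiment. A switch is counted whenever the task-set label differs from that of the preceding action (the first task-set is not counted), and actions are assumed not to overlap.

```python
from typing import List, Tuple

# (start_s, end_s, task_set) for one participant; hypothetical record layout.
Action = Tuple[float, float, str]


def time_occupied(actions: List[Action], scenario_s: float) -> float:
    """Percentage of scenario time spent on actions (assumes no overlap)."""
    busy = sum(end - start for start, end, _ in actions)
    return 100.0 * busy / scenario_s


def task_set_switches(actions: List[Action]) -> int:
    """Count changes of task-set label between consecutive actions."""
    ordered = sorted(actions, key=lambda a: a[0])
    return sum(1 for prev, cur in zip(ordered, ordered[1:]) if prev[2] != cur[2])


def relative_action_time(participant_s: float, expert_s: float) -> float:
    """Participant action time as a percentage of the expert baseline."""
    return 100.0 * participant_s / expert_s


# Invented example: a 900 s scenario with four scored actions.
timeline = [
    (0.0, 120.0, "fire"),
    (150.0, 300.0, "fire"),
    (320.0, 500.0, "propulsion"),
    (600.0, 690.0, "fire"),
]
to_percent = time_occupied(timeline, 900.0)
switches = task_set_switches(timeline)
rat = relative_action_time(540.0, 400.0)
```

A relative action time above 100% in this sketch means that the participant needed more time than the expert baseline for the same actions.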

2.6. Design

The combination of the independent variables resulted in eight different conditions.

A separate scenario had to be developed for each condition. Because it was considered

not to be possible to use the same scenario type (e.g. MBDs) for all conditions, three

different types of scenario were used (table 1). One scenario type was used for the

conditions 1, 2, 5, and 6, one for conditions 3 and 4, and one for conditions 7 and 8.

Only one operator and one manager in each team were evaluated. Thirteen teams

participated in the experiment (10 teams from the ships, and three expert teams for the

pilot and to determine the baseline times). Because of the chosen design, it was not

possible to perform all conditions with the same teams. Four teams performed conditions

1, 2, 5, and 6; three teams performed conditions 3 and 4, and the other three teams

performed conditions 7 and 8. Therefore data from different teams have been compared in

the evaluation of some factors. This may reduce the reliability of the subjective ratings

because subjects generally do not use the same ranges and baselines. For example, some

subjects will give ratings between 4 and 8 and others between 0 and 10. Therefore the

results for TSS have the lowest reliability. The results for TO are from the same teams,

and therefore have the best reliability.

2.7. Apparatus

The experiments took place in the SCC simulator of the RNlN. In addition to the

standard instruments used in the simulator, extra equipment had to be installed to make

audio and video recordings of the scenario. Furthermore, computers and video monitors

had to be installed for the evaluation session. Six cameras, a splitter, a super-VHS video

recorder, a mixing console for audio, and two clip-on microphones were used for the

video and audio recordings. Two laptops, two monitors, and two headphones were used

for the evaluation session.

2.8. Hypotheses

1. Application of the CTL method results in good predictions of the CTL that will

actually appear in the SCC simulator. The experimental set-up provides substantial

differences between low and high variations of TO, TSS, and LIP.

2. When hypothesis 1 is satisfied:

(a) each factor will affect SME and performance;

(b) we can determine critical conditions, such as underload, overload, cognitive

lock-up, and vigilance areas.

2.9. Statistical analyses

In the current experimental set-up only descriptive statistics can be calculated. The

number of teams per corner is too low and different teams have to be compared, which

might increase the variation in the data. An experiment of this type, carried out in a

realistic setting, is less controlled and has fewer subjects than a laboratory study.

However, many interesting results can be derived from the data. Therefore the data will

be used to describe trends that complement the earlier controlled experiments.

The data of one manager performing the tasks in corners 7 and 8 deviated strongly

from the other data. It appeared that this manager had received little training and was

not motivated during the experiment. Therefore this person was excluded from the

analysis.

2.10. Procedure

The procedure consisted of the following elements (the numbers in parentheses are the

estimated times):

. introduction (10 minutes);

. instruction (10 minutes);

. warming-up scenario (10 minutes);

. instruction of evaluation (20 minutes);

. experimental scenario (15 minutes);

. evaluation (video replay) of experimental scenario (15 minutes).

The experimental scenario and evaluation took place four times a day, for different

corners. The participants took a short break after each scenario (about 10 minutes).

Because it was hard to get both the manager and the operator in corner 8 of the workload

cube simultaneously, conditions 7 and 8 were combined in one scenario. During the first

part of this scenario the operator was in corner 7 and the manager in corner 8; in the

second part this was reversed. When the participants arrived, they were welcomed and an

introduction to the experiment was given. It was emphasized that the recorded scenarios

would only be used for the experiment, and not for operational assessment of their task

execution.

3. Results

The results can be divided into a validation section and an SME and performance

section. The extent to which the experimental manipulations were successful is

described in section 3.1. The relation between the factors in the model and SME and

performance is described in section 3.2. The numerical data for each condition are

presented in table 2.

3.1. Validation

The data for the validation section are calculated separately for each operator and

manager. Four comparisons, corresponding to the edges of the CTL cube, can be made

for each load factor. Therefore this cube is presented in each graph as a legend. In every

figure, the predicted low–high levels of the independent variables are shown on the

horizontal axis. The vertical axis shows the actual measured values of the dependent

variables.

3.1.1. Time occupied. The measure of TO is defined as the sum of the duration of each

action relative to the total duration of the scenario. Figure 4 shows the results for each

team member and each condition.

. For the operators, TO increased only for the teams who performed the high LIP

tasks. The average increase in TO is 3%.

. For the managers, TO increased in all conditions with an average of 6%.

Table 2. Average values of all measures for each condition.

Operator Manager

Condition 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8

Effort rating 2.8 3.1 2.2 3.3 3.9 4.3 5.2 6.8 2.4 3.0 4.9 5.1 4.3 4.5 4 3.1

Complexity rating 2.4 2.5 1.5 2.1 3.7 3.9 4.1 6 2.5 3.1 4.2 5 4.7 4.7 4.6 3.6

Relative action time (%) 155 140 93 115 176 103 193 123 152 126 136 111 181 146 168 182

Performance 5.5 8 6.8 6.3 6.7 6.1 6.2 6 6.4 8.5 6.7 6.1 8.2 7.8 4.8 7.2

TO (%) 44.2 42.2 33.9 29.4 37.3 42.4 54.5 68.1 54.9 58.2 51.7 53.2 65.5 70.8 53.5 68.1

TSS 2 2 5 11 4 2 9 13 2 2 17 16 6 4 26 32

LIP 2.4 2.5 1.5 2.1 3.7 3.9 4.1 6 2.5 3.1 4.2 5 4.7 4.7 4.6 3.6

TO experts (%) 37 40 42 25 34 56 37 72 46 63 39 53 56 66 42 55

TSS experts 1 2 2 9 3 2 10 4 1 2 13 8 3 4 14 18

3.1.2. Task-set switches. TSS is defined as the number of times a team member started

a new task-set. The results are presented in figure 5.

. TSS was much higher in the ‘high’ condition than in the ‘low’ condition, especially

for the managers.

. TSS for the operators increased by an average of 7.

. TSS for the managers increased by an average of 19.

. It should be noted that ‘low’ and ‘high’ TSS conditions are performed by different

teams.

3.1.3. Level of information processing. LIP is validated by the ‘complexity’ ratings.

Operators rated the complexity each minute during the video replay session. The average

ratings are shown in figure 6.

Figure 4. Percentage of time occupied.

Figure 5. Number of task-set switches.

. The operators provided much higher ratings in the ‘high’ than in the ‘low’ LIP

condition (average increase of 2.3; the four high–low differences sum to 9.2). Moreover, the effects for high TSS (dotted lines,

triangles and diamonds) are much more pronounced than those for low TSS (solid

lines, circles and squares).

. This effect was less pronounced for the managers (average increase of 0.7; the four differences sum to 2.8); indeed, the

opposite effect was found for high TSS and high TO (dotted line, triangles).

. It should be noted that the ‘low’ and ‘high’ levels of the solid lines were obtained from

the same teams, and the levels of the dotted lines were obtained from different teams.

The strong positive slope of the dotted lines for the operators and the negative slope

of the dotted line (triangles) for the managers may be due to this reduced reliability.

3.1.4. Summary of the validation.

. The factor TO resulted in higher values only for the operators performing the high

LIP condition. The values found for operators performing the low LIP condition are

the opposite of what was expected: the high conditions had a lower TO than the low

conditions. The data for the managers are all in the expected direction: the values of

all high conditions are higher than the low conditions. However, on six of the eight

high–low comparisons, the difference in TO is very small (≤5.3%), and so the

manipulation of TO did not result in substantial differences.

. For all operators and managers large differences were measured between the low and

high conditions of TSS. Thus ‘task-set switches’ was a valid experimental factor.

. For the operators, higher complexity ratings were measured on all high LIP

conditions. For the managers, only one line is in the non-expected direction (the high

TO, high TSS line). Overall we can conclude that there was a substantial difference

between the ‘low’ and ‘high’ levels of this factor in the expected direction.

3.2. SME and performance

The data points are plotted in 3D graphs to give an overview of the results in a similar

way as the 3D task load model. Each figure shows the actual measured values

of a dependent variable for each condition (i.e. predicted low–high levels of

Figure 6. Task complexity ratings.

the independent variables). The results for each factor of the CTL model are described

separately. Four conditions can be compared for the operators and four for the

managers, giving a total of eight comparisons. Conditions 1 and 2, 3 and 4, 5 and 6, and

7 and 8 are compared for the factor TO, conditions 1 and 3, 2 and 4, 5 and 7, and 6 and 8

are compared for the factor TSS, and conditions 1 and 5, 2 and 6, 3 and 7, and 4 and 8 are

compared for the factor LIP. Each 3D graph is accompanied by a table showing the

average values and the positive or negative direction of each comparison. Section 3.1

shows that the manipulation of TO and LIP for the managers failed. Obviously, this has

consequences for the results on CTL and performance, and should be taken into account

in the interpretation.
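
The pairing of conditions listed above is consistent with numbering the corners of the cube so that the condition number minus one, read as a three-bit number, has TO as the lowest bit, TSS as the middle bit, and LIP as the highest bit. This encoding is an inference from the listed pairs rather than something stated for the CTL method, and the function names are hypothetical. Under that assumption, the comparisons can be regenerated as follows.

```python
FACTORS = ("TO", "TSS", "LIP")  # bit 0, bit 1, bit 2 (assumed encoding)


def levels(condition: int) -> dict:
    """Return 'low' or 'high' per load factor for a condition numbered 1..8."""
    code = condition - 1
    return {
        factor: "high" if ((code >> bit) & 1) == 1 else "low"
        for bit, factor in enumerate(FACTORS)
    }


def comparison_pairs(factor: str) -> list:
    """(low, high) condition pairs that differ only in the given factor."""
    bit = FACTORS.index(factor)
    return [
        (c, c + (1 << bit))
        for c in range(1, 9)
        if (((c - 1) >> bit) & 1) == 0
    ]


pairs = {factor: comparison_pairs(factor) for factor in FACTORS}
```

With this numbering, condition 1 is low on all three factors and condition 8 is high on all three.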

3.2.1. Subjective mental effort ratings. Participants rated their effort expenditure each

minute during the replay of the video. The results are presented in figure 7 and table 3.

Time occupied:

. for the operators, all high TO conditions resulted in higher SME;

. for the managers, only condition 8 shows a lower SME score than condition 7;

. the average increase in SME in the high TO conditions was rather small (0.5).

Task-set switches:

. for the operators, three of the high TSS conditions had higher SME ratings, and the

fourth condition had a lower value; condition 3 showed lower values than condition 1;

Figure 7. Average subjective mental effort ratings of the operators and managers for each

condition.

Table 3. Subjective mental effort: average differences between the high and low conditions and number of changes in the positive direction for each factor of the CTL model.

Average change Changes in positive direction

Operator Manager Total Operator Manager Total

TO 0.9 0.1 0.5 4/4 3/4 7/8

TSS 0.9 0.7 0.8 3/4 2/4 5/8

LIP 2.2 0.1 1.2 4/4 2/4 6/8

. for the managers, two high TSS conditions showed higher SME ratings, and two

showed a lower rating; condition 7 is a little lower than condition 5 and condition 8 is

lower than condition 6;

. the average increase in SME in the high TSS condition was 0.8;

. it should be noted that the teams performing the low TSS conditions were different from

the teams performing the high TSS conditions; therefore the comparisons for the TSS

are less reliable.

Level of information processing:

. for the operators, all high LIP conditions resulted in higher SME;

. for the managers, two high LIP conditions showed a higher SME, and two showed a

lower SME; condition 7 was lower than condition 3 and condition 8 was lower than

condition 4;

. the average increase in the high LIP conditions was 1.2, which was much higher than

for the factors TO and TSS; this increase in SME is completely due to the operators;

. it should be noted that the data from the managers in conditions 7 and 8 are from two

teams only, whereas the data from conditions 3 and 4 are from three teams; therefore

comparisons of 3 and 7 and of 4 and 8 are less reliable;

. the teams that performed conditions 3 and 4 were different from those performing

conditions 7 and 8, which also makes the comparisons less reliable.
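
The operator column of table 3 can be recomputed from the per-condition effort ratings (also listed in table 7). The sketch below is one plausible reading of how the entries were obtained: for each factor it averages the high-minus-low differences over the four comparison pairs and counts the positive ones. The function and variable names are hypothetical.

```python
# Average SME per condition 1..8 for the operators (effort ratings).
sme_operator = [2.8, 3.1, 2.2, 3.3, 3.9, 4.3, 5.2, 6.8]

# (low, high) condition pairs per factor, as listed in section 3.2.
PAIRS = {
    "TO": [(1, 2), (3, 4), (5, 6), (7, 8)],
    "TSS": [(1, 3), (2, 4), (5, 7), (6, 8)],
    "LIP": [(1, 5), (2, 6), (3, 7), (4, 8)],
}


def summarize(values, pairs):
    """Return (mean high-minus-low difference, number of positive changes)."""
    diffs = [values[high - 1] - values[low - 1] for low, high in pairs]
    return sum(diffs) / len(diffs), sum(d > 0 for d in diffs)


summary = {factor: summarize(sme_operator, p) for factor, p in PAIRS.items()}
for factor, (mean_diff, n_pos) in summary.items():
    print(f"{factor}: {mean_diff:.2f} ({n_pos}/4)")
```

For the operators this gives mean differences of 0.85 (TO), 0.85 (TSS), and 2.2 (LIP), with 4, 3, and 4 positive changes, which agrees with the rounded operator entries of table 3.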

3.2.2. Performance: relative action time. Figure 8 and table 4 present the relative action

time, i.e. the time that the participants were actively involved in performing actions

relative to the time that the experts needed for these actions.

Time occupied:

. for the operators, only one high TO condition had a higher relative action time;

condition 4 has a higher action time than condition 3;

. for the managers, the only high TO condition with a higher relative action time is 8,

compared with condition 7;

. thus six of the eight comparisons showed higher relative action times for low TO

conditions; the average difference between the low and high conditions was large (26%).

Figure 8. Relative action time for each condition.

Task-set switches:

. TSS did not show a consistent pattern of results (average change of −7%);

. for the operators, two high TSS conditions showed a higher relative action time;

. for the managers, only one high TSS condition showed a higher relative action

time.

Level of information processing:

. for the operators, three high LIP conditions showed higher relative action times; only

condition 6 showed a lower relative action time than condition 2;

. for the managers, all high LIP conditions showed higher relative action times;

. the high LIP conditions resulted in a substantial increase in relative action time; the

average difference between the low and high conditions was 31% (23% for the operators and 38% for the managers).

3.2.3. Performance: expert ratings. Figure 9 and table 5 present the performance ratings

provided by experts.

Time occupied:

. the operators scored lower expert ratings on three high TO conditions; only condition

2 scored higher than condition 1;

Table 4. Relative action time (%): average differences between the high and low conditions and number of changes in the positive direction for each factor of the CTL model.

Average change Changes in positive direction

Operator Manager Total Operator Manager Total

TO −34 −18 −26 1/4 1/4 2/8

TSS −13 −2 −7 2/4 1/4 3/8

LIP 23 38 31 3/4 4/4 7/8

Figure 9. Average performance scores (ratings provided by experts) for each condition.

. the managers scored lower expert ratings on two high TO conditions; condition

2 scored higher than condition 1 and condition 8 scored higher than 7;

. the average reduction in expert ratings for the high TO conditions was 0.6 points.

Task-set switches:

. the operators scored lower expert ratings on three high TSS conditions; only

condition 3 scored higher than condition 1;

. the managers scored lower expert ratings on three high TSS conditions; only

condition 3 scored higher than condition 1;

. the average reduction on the high TSS conditions was 0.9 points, which was mainly

due to the reduced performance of the managers (1.5 points).

Level of information processing:

. the operators scored lower expert ratings on three high LIP conditions; only condition

5 scored higher than condition 1;

. the managers scored lower expert ratings on two high LIP conditions; condition

5 scored higher than condition 1 and condition 8 scored higher than condition 4;

. the average reduction in performance score was only 0.2 points.

3.2.4. Summary of the SME and performance section.

. LIP had the greatest impact on the SME ratings, especially for the operators.

. The participants needed more time to perform the required actions than the

experts. A substantial additional increase was found in the high LIP conditions,

whereas in the high TO and high TSS conditions the participants needed relatively

less extra time.

. On six of the eight low TO conditions the participants had a higher relative action

time (with an average of 26%) compared with the high TO conditions. Apparently the

participants used extra time in low TO conditions.

4. Discussion

The design and results of the experiment have been reported in the previous two sections.

We applied the CTL method and implemented the scenarios in the RNlN simulator.

The CTL method appears to be very useful for predicting the ‘location’ of a scenario in the

task–load space. However, the following points of discussion arose during the research.

Table 5. Performance: average differences between the high and low conditions and number of changes in the negative direction for each factor of the CTL model.

Average change Changes in negative direction

Operator Manager Total Operator Manager Total

TO −0.3 −0.9 −0.6 3/4 2/4 5/8

TSS −0.2 −1.5 −0.9 3/4 3/4 6/8

LIP −0.4 0.1 −0.2 3/4 2/4 5/8

4.1. TO, TSS, and LIP

Manipulation of TO was not successful. In particular, the low TO conditions were very

hard to manipulate. As we did not have any problems with this in earlier more controlled

experiments, this appears to be attributable to the complexity of this experiment.

However, the results of the experts on TO are much more in line with our expectations

(figure 10). Only the manipulation of conditions 3 and 4 failed. Based on the expert data

of figure 10, we would expect an even larger effect on TO for the participants because of

the effect of TSS and LIP on TO. An explanation can be found in the relative action time.

The participants used relatively more time (compared with the experts) in the low TO

conditions than in the high TO conditions (figure 8). This could be because they realized

that they had more time available, because there was little or no time pressure, and made

use of it.

The manipulation of TSS worked quite well. Small differences between the low and

high TSS conditions were found on SME (i.e. an average increase of 0.8 over operators and managers);

however, performance degraded on high TSS conditions, especially for the managers

(expert ratings decreased by 1.5). As was also found by Neerincx et al. (2000) and

Neerincx and Passenier (2000), high TSS is a critical factor in current and new ships of

the Royal Netherlands Navy. As can be seen in figure 11 and table 6, in most conditions

the participants switch much more than in the optimal strategy followed by the experts.

Based on these data, a scheduler to help the operator determine an efficient strategy,

as described by Grootjen et al. (2002), seems necessary to keep the performance at

acceptable levels.

The manipulation of LIP worked well. The SME ratings for high LIP increased, the

expert performance ratings were slightly lower, and the relative action time strongly

increased at high LIP conditions. This large effect of LIP was also found in earlier

research (Grootjen et al. 2002, Neerincx and van Besouw 2001, Neerincx et al. 2003a).

An effective way of supporting highly complex situations was described in Neerincx and

Lindenberg (2000), where the use of a diagnostic guide and a rule provider reduced the

complexity and kept the performance at the desired level.

Neerincx (2003) distinguished several critical regions (underload, vigilance, cognitive

lock-up, and overload). Figure 1 gives an overview of these regions. Obviously we are

Figure 10. TO for the experts.

interested in finding these critical regions in the data for the current experiment, i.e. for

corners 1 (underload), 2 (vigilance), 4 (cognitive lock-up), and 8 (overload and cognitive

lock-up).
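
The assignment of critical regions to corners can be written down as a small lookup. The sketch below assumes that the corner number is obtained by weighting high TO, high TSS, and high LIP with 1, 2, and 4 respectively, which is inferred from the comparison pairs listed in section 3.2; the names are hypothetical, and corners without a named region return an empty result.

```python
# Critical regions named in the text for corners 1, 2, 4 and 8 of the cube.
CRITICAL_REGIONS = {
    1: ("underload",),
    2: ("vigilance",),
    4: ("cognitive lock-up",),
    8: ("overload", "cognitive lock-up"),
}


def corner(to_high: bool, tss_high: bool, lip_high: bool) -> int:
    """Corner number 1..8, assuming TO, TSS and LIP weigh 1, 2 and 4."""
    return 1 + int(to_high) + 2 * int(tss_high) + 4 * int(lip_high)


def critical_regions(to_high: bool, tss_high: bool, lip_high: bool) -> tuple:
    """Regions expected at a corner; an empty tuple if none is named."""
    return CRITICAL_REGIONS.get(corner(to_high, tss_high, lip_high), ())
```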

Corner 1 has a low SME score (average 2.6), a low performance score on the expert

ratings (average 6.0) and a very high relative action time (154%). A possible explanation

for these low scores could indeed be underload of the participants.

In contrast with what would be expected if vigilance problems had appeared, corner 2 has a high

performance score (average 8.3). SME was low (average 3.1). Apparently managers and

operators like to work in this condition, with no difficult tasks and almost no switches

between tasks. They used extra time to achieve this high score (average relative action

time 133%); the relative action time is higher in this condition than in the high TSS and

high LIP conditions (corners 4 and 6). Vigilance is a well-known problem which appears

when operators have to monitor tasks continuously or when boredom arises in highly

repetitive tasks (Levine et al. 1973, Parasuraman 1986). Neither of these causes appeared

in our scenario; scenarios took only 15 minutes and were too short for vigilance problems

to appear.

No evidence of cognitive lock-up was found in corners 4 and 8. This could be partially

because we did not specifically evaluate the scenarios on cognitive lock-up. However, the

expert ratings decreased when the participants switched too late to a problem with higher

priority, and so serious cognitive lock-up would have been found in the expert ratings.

The operators in corner 8 appeared to be overloaded. Compared with the experts, the

operators made 333% as many switches between tasks, with an SME of 6.8, a relative action

time of 123%, and expert ratings of 6.0. The high TSS, combined with a TO that is almost

Figure 11. TSS for the participants relative to TSS for the experts (%).

Table 6. TSS data of the participants, experts, and the relative TSS (%).

Operator Manager

Condition 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8

TSS participants 2 2 5 11 4 2 9 13 2 2 17 16 6 4 26 32

TSS experts 1 2 2 9 3 2 10 4 1 2 13 8 3 4 14 18

Relative TSS (%) 150 100 250 126 117 100 93 333 150 100 128 196 192 88 186 178

the same as the TO of the experts (95%) (figure 12), showed that they did not know what

to do. The signs of overload are less obvious for the managers in corner 8; however,

they have a high relative action time (182%) and their relative TSS is 178%. In contrast

with the operators, the managers appear to take more time to perform their tasks at

high CTL.

4.2. Experimental method

In the original experimental set-up, the participants had to indicate the start and finish of

all their actions after the scenario. As stated in section 2.3, the participants were unable to

do this in the current set-up because there were too many actions to indicate. The data

that were collected from this evaluation were not used. Instead of the participants

evaluating the action times, this was done by two experts. The experts sometimes found it

difficult to ascertain whether an operator or a manager was performing an action, and of

course they could only evaluate the visible interactions with the system. Therefore some

actions that were quite demanding (i.e. thinking about complex problems) were hard to

evaluate. For example, monitoring the system was not scored as an action. This had

consequences for the evaluation of the operator in the fire scenario (corners 3 and 4)

whose main task was monitoring the smoke boundaries.

After each scenario the participants had to rate its complexity and effort. Each minute,

first complexity and immediately afterwards mental effort had to be scored. As can be

seen in table 7, the values are highly correlated. Despite the thorough explanation, it is

possible that the participants found it difficult to distinguish between complexity and

effort and scored similar values.

Figure 12. TO for the participants relative to the TO for the experts.

Table 7. Correlation between complexity and effort ratings.

Operator Manager

Condition 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8

Complexity rating 2.4 2.5 1.5 2.1 3.7 3.9 4.1 6 2.5 3.1 4.2 5 4.7 4.7 4.6 3.6

Effort rating 2.8 3.1 2.2 3.3 3.9 4.3 5.2 6.8 2.4 3.0 4.9 5.1 4.3 4.5 4 3.1

Correlation 0.97 0.91
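
The correlations in table 7 are consistent with a Pearson correlation computed over the eight condition averages of each role. The paper does not state which coefficient was used, so the sketch below is an assumption that happens to reproduce the reported values; the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Condition averages 1..8 taken from table 7.
complexity = {
    "operator": [2.4, 2.5, 1.5, 2.1, 3.7, 3.9, 4.1, 6.0],
    "manager": [2.5, 3.1, 4.2, 5.0, 4.7, 4.7, 4.6, 3.6],
}
effort = {
    "operator": [2.8, 3.1, 2.2, 3.3, 3.9, 4.3, 5.2, 6.8],
    "manager": [2.4, 3.0, 4.9, 5.1, 4.3, 4.5, 4.0, 3.1],
}
r = {role: pearson(complexity[role], effort[role]) for role in complexity}
```

Rounded to two decimals, this gives 0.97 for the operators and 0.91 for the managers.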

The variation in subjective ratings between participants can be high because

participants use different baselines. Complexity ratings do not provide absolute values

but are relative to the individual’s baseline. As long as the same subject provides ratings

for all the conditions to be analysed, this does not pose a problem. However, if conditions

containing ratings from different participants are compared, the comparison becomes less

reliable.

As explained in section 2.10, it was not possible for both the operator and the manager

to be in corner 8 simultaneously. Therefore we switched conditions 7 and 8 for the

manager. During the experiments it became clear that it was not possible to switch

conditions 7 and 8 for the operator and the manager entirely independently of each other;

the managers (in condition 7) helped the operators (in condition 8). This could be an

explanation for the low performance and high SME score for the managers in

condition 7.

After the experiments an expert from the RNlN gave his opinion of the poor

performance of the M-officer in general. Compared with the (propulsion) operator, the

M-officer spends much less time in the SCC, in fact only in a high-readiness state.

However, an operator spends many hours in the SCC during his career, not only in high-

readiness states, but also in normal circumstances.

By performing and analysing the scenarios, the participants learnt about their mistakes

during the experiments. This could have been of particular benefit to the managers of

the MBD scenarios (corners 1, 2, 5, and 6 of the cube) who came back as operators in

the battlestation scenarios (corners 7 and 8). The scenarios differ from each other, but

some actions could have been repeated in conditions 7 and 8.

The operators who participated in the fire scenarios (corners 3 and 4) differed in rank,

education and experience. The reason for this is that the person doing this task is not

prescribed. Accordingly, one ship sent a petty-officer and another sent a sailor to act in

this function, and this probably influenced the results.

Some conditions are not very common in certain domains. For example, in our

experiment it was difficult to produce a scenario with high TSS and low TO. This problem

was partially overcome by choosing different types of scenarios, but some combinations

of load factor are still hard to construct.

Although we found signs of underload, the experimental set-up appeared to be

mainly concentrated on overload. During the experiment we noticed that a substantial

increase in scenario time will be needed to detect further underload and vigilance

problems, and that operators are not inclined to ‘do nothing’ in a simulator that is

being used for training. However, the importance of underload has already been

demonstrated in some other domains. For example, Young and Stanton (2002) show

that automation in an automobile can lead to a substantial reduction in mental

workload. In the case of an automation failure, driver errors could occur. In a subsequent

experiment, we will study underload in more detail during crew operations on an

ADCF at sea.

5. Conclusions and recommendations

5.1. Conclusions

This experiment provided further empirical support for the CTL model and method, and

gives an initial estimation of the critical load values for the SCC. In general we can draw

the following conclusions.

1. The CTL method provides a good prediction of the task load on TSS and LIP that

will actually appear in the SCC simulator. The manipulation of LIP and TSS was

successful; no substantial differences were found on TO.

2. High levels of TSS and LIP resulted in a reduced performance and increased SME.

A reduction in the expert performance ratings was found for all high conditions; the

largest reduction was found for managers in the high TSS condition. The largest

effects on SME were found for the operators, especially at high LIP. Because the

manipulation of TO failed, little or no effect on performance and SME could be

found. In corner 1 the participants scored low on performance and underload was

detected. Corner 8 showed signs of overload, especially for the operators. They

scored low on performance and high on SME. No signs of vigilance problems and

cognitive lock-up could be found.

This experiment was conducted in a realistic complex high-demand environment.

Because the evaluation was performed with real end-users the number of participants was

limited, and so no statistical analysis could be performed. However, similar results were

found in earlier more controlled experiments with more participants (the statistical

analyses of these experiments proved that their results were significant). For example,

Neerincx and van Besouw (2001), Grootjen et al. (2002), and Neerincx et al. (2003a)

determined high values of LIP and TSS. Neerincx and Griffioen (1996) identified a critical

overload and an adequate load area for LIP and TO, similar to that found in the current

experiment, for operators in the railway traffic control centre of the Netherlands

Railways.

5.2. Future research

Obviously, the ultimate goal is to keep the operator in the optimal CTL space of figure 1.

By using a scenario-based design (Carroll 2000) and the CTL method and model, we are

now able to predict CTL in the design process of new systems. Tools such as task

allocation (Neerincx et al. 2003b) and interface support can be used to keep the operator

in an optimal load space during design. Grootjen et al. (2002) validated an interface

concept which was specifically designed to support the operator on the CTL load factors.

Unfortunately, not everything can be foreseen in the design process. Because of this,

another set of tools is needed to keep the CTL at an optimum level. Alty (2003) suggested

the development of adaptive systems as a possible way forward for decreasing cognitive

workload, particularly in the control of large dynamic systems. Two examples are given

below.

The first example is adaptive interface support. The level of automation and the

amount of information supply can be altered using an adaptive interface. A second

example is task allocation. Task allocation refers to the process of redistributing tasks

amongst actors, with the goal of improving overall system performance (Endsley

and Kaber 1999). Such redistribution is usually a response to a change in either

situational factors or actor characteristics. In response to a sudden increase in workload,

a task could shift from operator control to system control. These examples of cognitive

support should depend on the operator state (e.g. physiological measures) and task

demands (CTL model). Subsequently, information about context and the technical

system is needed to form an adequate adaptation mechanism.
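
As an illustration only, the dependence of task allocation on task demands and operator state might be sketched as follows. The three load factors (time occupied, task-set switching, level of information processing) come from the CTL model; the normalisation to a 0 to 1 range, the thresholds `HIGH` and `LOW`, the single `operator_strain` value, and the function name `propose_allocation` are hypothetical choices made for this sketch and are not part of the method described here.

```python
from dataclasses import dataclass


@dataclass
class TaskLoad:
    """Task demands expressed on the three CTL load factors.

    All values are assumed to be normalised to the range 0..1;
    this normalisation is an assumption of the sketch.
    """
    time_occupied: float       # TO
    task_set_switches: float   # TSS
    info_processing: float     # LIP


# Hypothetical thresholds separating low and high load.
HIGH = 0.7
LOW = 0.3


def propose_allocation(load: TaskLoad, operator_strain: float) -> str:
    """Propose a task allocation change from task demands and operator state.

    operator_strain stands in for a physiological measure, assumed
    to be normalised to 0..1.
    """
    factors = (load.time_occupied, load.task_set_switches, load.info_processing)
    # Overload: all load factors high and the operator shows high strain.
    if all(f >= HIGH for f in factors) and operator_strain >= HIGH:
        return "shift task to system"
    # Underload: all load factors low and the operator shows low strain.
    if all(f <= LOW for f in factors) and operator_strain <= LOW:
        return "shift task to operator"
    return "keep allocation"


print(propose_allocation(TaskLoad(0.9, 0.8, 0.9), operator_strain=0.8))
```

A real adaptation mechanism would also need the context and technical-system information mentioned above; the sketch leaves these out.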

One of the major issues in this form of automation is whether the system or the

operator should initiate the change. van der Kruit (2004) explicitly distinguishes two

different forms of automation. The first is adaptable automation, which occurs when the

control of a task shifts from the operator to the system, or vice versa, initiated by the

operator and because of operator-perceived changes in the state of the world. The second

is adaptive automation, which occurs when the control of a task shifts from the operator

to the system or vice versa, initiated by the system, and because of system-perceived

changes in the state of the world. Current research suggests the possibility of a third

form of automation, in which the change is determined by operator and system together.
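
The distinction rests entirely on who initiates the shift in control, which can be made explicit in a small sketch. The names `Initiator` and `automation_form` are hypothetical, and the label "shared" is only a placeholder, since the third form is not named in the text.

```python
from enum import Enum


class Initiator(Enum):
    """Who initiates the shift in control of a task."""
    OPERATOR = "operator"
    SYSTEM = "system"
    BOTH = "both"


def automation_form(initiator: Initiator) -> str:
    """Classify the form of automation by the initiator of the change."""
    if initiator is Initiator.OPERATOR:
        # Operator-initiated, based on operator-perceived changes.
        return "adaptable"
    if initiator is Initiator.SYSTEM:
        # System-initiated, based on system-perceived changes.
        return "adaptive"
    # Change determined by operator and system together (placeholder label).
    return "shared"


print(automation_form(Initiator.SYSTEM))
```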

Obviously, research about adaptive (or adaptable) support is still in its infancy.

Many interesting and challenging human factors issues for real-time application are

still undiscovered. The CTL model seems to provide a good basis to develop such

automation.

References

ALTY, J.L., 2003, Cognitive workload and adaptive systems. In Handbook of Cognitive Task Design, E. Hollnagel

(Ed.), pp. 129–146 (Mahwah, NJ: Erlbaum).

BEEVIS, D., 1992, Analysis techniques for man–machine systems design. NATO/Panel 8-RSG.14, Technical

Report AC/243(Panel 8)TR/7, North Atlantic Treaty Organization, Brussels.

BOEHNE, D.M. and PAESE, P.W., 2000, Deciding whether to complete or terminate an unfinished project: a strong

test of the project completion hypothesis. Organizational Behavior and Human Decision Processes, 2, 178–194.

CARROLL, J.M., 2000, Making Use: Scenario-Based Design of Human–Computer Interactions (Cambridge, MA:

MIT Press).

ENDSLEY, M.R. and KABER, D.B., 1999, Level of automation effects on performance, situation awareness and

workload in a dynamic control task. Ergonomics, 42, 462–492.

GROOTJEN, M., NEERINCX, M.A. and PASSENIER, P.O., 2002, Cognitive task load and support on a ship’s bridge:

design and evaluation of a prototype user interface. In Proceedings INEC 2002, pp. 198–207.

HELANDER, M., LANDAUER, T.K. and PRABHU, P.V., 1997, Handbook of Human–Computer Interaction, 2nd edn

(Amsterdam: North-Holland).

HOLLNAGEL, E., 2003, Handbook of Cognitive Task Design (Mahwah, NJ: Erlbaum).

JACKO, J.A. and SEARS, A., 2003, The Human–Computer Interaction Handbook: Fundamentals, Evolving

Technologies and Emerging Applications (Mahwah, NJ: Erlbaum).

KERSTHOLT, J.H. and PASSENIER, P.O., 2000, Fault management in supervisory control: the effect of false alarms

and support. Ergonomics, 43, 1371–1387.

KIRWAN, B. and AINSWORTH, L.K., 1992, A Guide to Task Analysis (London: Taylor & Francis).

KOUBEK, R.J., BENYSH, D., BUCK, M., HARVEY, C.M. and REYNOLDS, M., 2003, The development of a theoretical

framework and design tool for process usability assessment. Ergonomics, 46, 220–241.

LEVINE, J.M., ROMASHKO, T. and FLEISHMAN, E.A., 1973, Evaluation of an abilities classification system for

integrating and generalizing human performance research findings: an application to vigilance tasks. Journal

of Applied Psychology, 58, 149–157.

MAGUIRE, M., 2001, Methods to support human-centred design. International Journal of Human–Computer

Studies, 55, 587–634.

MAYHEW, D.J., 1999, The Usability Engineering Lifecycle: A Practitioner’s Handbook for User Interface Design

(San Francisco, CA: Morgan Kaufmann).

NEERINCX, M.A., 2003, Cognitive task load design: model, methods and examples. In Handbook of Cognitive

Task Design, E. Hollnagel (Ed.), pp. 283–305 (Mahwah, NJ: Erlbaum).

NEERINCX, M.A. and VAN BESOUW, N.J.P., 2001, Cognitive task load: a function of time occupied, level of

information processing and task-set switches. Industrial Ergonomics, HCI, and Applied Cognitive Psychology,

6, 247–254.

NEERINCX, M.A. and GRIFFIOEN, E., 1996, Cognitive task analysis: harmonizing task to human capacity.

Ergonomics, 39, 543–561.

NEERINCX, M.A. and LINDENBERG, J., 2000, DISCII: Design of interface support for task-set switching and

integration, Report TM-00-D006, TNO Human Factors Research Institute, Soesterberg, The Netherlands.

NEERINCX, M.A. and PASSENIER, P.O., 2000, Cognitive engineering for the ADCF Integrated Monitoring and

Control System, Report TM-00-A031/E, TNO Human Factors Research Institute, Soesterberg,

The Netherlands.

NEERINCX, M.A., VAN DOORNE, H. and RUIJSENDAAL, M., 2000, Attuning computer-supported work to human

knowledge and processing capacities in ship control centres. In Cognitive Task Analysis, J.M.C. Schraagen,

S.E. Chipman and V.L. Shalin (Eds.), pp. 341–362 (Mahwah, NJ: Erlbaum).

NEERINCX, M.A., VAN BESOUW, N.J.P. and STREEFKERK, J.W., 2003a, Cognitive task load: memory capacity and

switch costs. Report, TNO Human Factors Research Institute, Soesterberg, The Netherlands.

NEERINCX, M.A., VAN DER DOBBELSTEEN, G.J.H., GROOTJEN, M. and VAN VEENENDAAL, J., 2003b, Assessing

cognitive load distributions for envisioned task allocations and support functions. In 13th International SCSS

(CDROM) Orlando.

NEERINCX, M.A., RYPKEMA, J.A. and PASSENIER, P.O., 2003c, Cognitive and functional (COLFUN) framework

for envisioning and assessing high-demand situations. In 9th European Symposium on Cognitive Science

Approaches to Process Control, pp. 11–16 (Amsterdam, The Netherlands: EACI Conference Series).

PARASURAMAN, R., 1986, Vigilance, monitoring, and search. In Handbook of Perception and Human Performance,

Vol. 2, Cognitive Processes and Performance, K.R. Boff, L. Kaufman and J.P. Thomas (Eds.), pp. 1–39

(New York: John Wiley).

RASMUSSEN, J., 1986, Information Processing and Human–Machine Interaction: An Approach to Cognitive

Engineering (Amsterdam: Elsevier).

ROSSON, M.B. and CARROLL, J.M., 2001, Usability Engineering: Scenario-based Development of Human–Computer

Interaction (San Francisco: Morgan Kaufmann).

RYPKEMA, J.A., RUIJSENDAAL, M., VAN DER BOSCH, K., SCHRAAGEN, J.M.C. and HOLEWIJN, M., 2002, Mental load

operator Westerscheldetunnel, Report TNO TM-02-C039, TNO Human Factors Research Institute,

Soesterberg, The Netherlands.

SCHRAAGEN, J.M.C., CHIPMAN, S.E. and SHALIN, V.L. (Eds.), 2000, Cognitive Task Analysis (Mahwah, NJ:

Erlbaum).

VAN DER KRUIT, V.L.I., 2004, Adaptive automation in command and control decision making. Report,

TNO Human Factors Research Institute, Soesterberg, The Netherlands.

VAN VEENENDAAL, J., 2002, Mental load of officer of the watch: an explorative study on the possibilities to add

tasks and computer support on the bridge, Royal Netherlands Navy Institute.

VELTMAN, J.A. and GAILLARD, A.W.K., 1999, Mental workload of the tactical co-ordinator of the Lynx helicopter,

Report TM A-036, TNO Human Factors Research Institute, Soesterberg, The Netherlands.

VELTMAN, J.A. and GAILLARD, A.W.K., 1996, Pilot workload evaluated with subjective and physiological

measures. In Aging and Human Factors, K. Brookhuis, C. Weikert, J. Moraal and D. de Waard (Eds.),

pp. 107–128.

WICKENS, C.D., 1992, Engineering Psychology and Human Performance, 2nd edn (New York: Harper-Collins).

YOUNG, M.S. and STANTON, N.A., 2002, Malleable attentional resources theory: a new explanation for the effects

of mental underload on performance. Human Factors, 44, 365–375.

ZIJLSTRA, F.R.H., 1993, Efficiency in Work Behaviour (Delft, The Netherlands: Delft University Press).
