Designing Modal Hypermedia Applications

Franca Garzotto §, Luca Mainetti §, Paolo Paolini §,#

§ HOC-Hypermedia Open Center, Politecnico di Milano, Italy
# Telemedia Lab, University of Lecce, Italy
E-mail: {garzotto, mainetti, paolini}@elet.polimi.it

ABSTRACT

Different users of a hypermedia application may require different combinations of modes, i.e., different ways of perceiving the content or different ways of interaction. Multimodality, intended as the coexistence of multiple combinations of modes in the same application, can improve application richness and can accommodate the needs of different categories of users. On the other hand, multimodality increases complexity and may affect usability, since a variety of different interaction styles may be disorienting for the users. Designing an effective multimode hypermedia is a difficult problem. This paper discusses this issue, presenting a taxonomy of different kinds of modes in hypermedia applications and introducing the concept of modal hypermedia interaction. Modal interaction means that the semantics of normal application commands depend not only on the application state, as usual, but also on the mode setting. We introduce a formal model for modal hypermedia interaction that helps us to analyse design alternatives and their impact on usability more precisely. We illustrate our approach with examples from a museum hypermedia called “Polyptych” that we actually built.

KEYWORDS: modal interaction, usability, hypermedia application design, hypermedia models

1 INTRODUCTION

The term “mode” has been loaded in the literature with a variety of meanings [7, 9, 15]. In the context of hypermedia applications, we can use the term for two different broad categories of meaning: communication mode and interaction mode. A communication mode denotes a “carrier of information” [9], i.e., the way used to convey the content of the application to the reader. An interaction mode denotes the way users interact with the application and utilize it.

A complex hypermedia application is naturally multimode in both of the above aspects: the content is conveyed through various combinations of media, languages, rhetorical styles, and presentation metaphors, and several interaction paradigms are available, related to different styles of information access, e.g., search and navigation [5], and to different ways of operating on media of different nature. Some combinations of modes are more appropriate for some user profiles but are unsatisfactory for others [1]. The proper choice of modes, for a given category of users, depends on a number of different factors: the user's expertise with the application domain, his knowledge of computers in general and of hypermedia in particular, the goal he is trying to achieve, the time available for the session, the context of use, the evolution of the current session, etc.

If a hypermedia aims to address several types of users and tasks, a multimode system is more appropriate than a mono-mode application (based on a single combination of modes) or a set of mono-mode applications. A mono-mode application is often a crude compromise among different user needs, none of them being fully satisfied. On the contrary, different combinations of modes within a single application can accommodate different categories of user requirements and can support a variety of tasks in different situations. Moreover, a user can switch between mode combinations more seamlessly than between the applications of a mono-mode set.

Unfortunately, the co-existence of several combinations of modes can affect usability, since the user is faced with the additional complexity of selecting the proper mode combination or switching between different combinations. The aim of this paper is to discuss this problem and to identify crucial usability issues that should be addressed when designing a multimode hypermedia. Our proposal is modal interaction, as a technique combining richness of solutions (i.e., availability of multiple combinations of modes) with usability.

Modal interaction is not a totally new concept, since several existing systems (word processors, for example) already make use of it, to a certain degree. The novelty of this paper is to specialize this concept for hypermedia. Our contributions are a taxonomy of hypermedia-specific modes, a model to precisely define the concept of modal hypermedia application, and an analysis of possible design trade-offs.

Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee.
Hypertext 97, Southampton UK
© 1997 ACM 0-89791-866-5...$3.50


Section 2 discusses different types of modes in the context of hypermedia applications and introduces the concept of modal interaction. Section 3 precisely defines the notion of modal hypermedia and proposes a formal model to describe it. Section 4 discusses design options. Section 5 presents a number of examples taken from a hypermedia application named “Polyptych” that we actually built. Section 6 draws the conclusions.

2 MODES AND MODAL INTERACTION IN HYPERMEDIA APPLICATIONS

In hypermedia applications, we can identify several different types of modes within the two broad categories of modes mentioned in the introduction: communication and interaction. They are summarized in the following table¹ and discussed in the rest of this section:

Mode category    Mode type           Example(s)
Communication    Media               Text, Animation+Sound
                 Rhetorical style    Concise
                 Language            English
                 Size                Short guided tours (e.g., max. 10 steps)
Interaction      Topology            Sequence, Tree
                 Control             Couch Potato
                 Access              Navigation, Query

Media mode (communication): a medium or a combination of different media characterizes the way content is conveyed. Different media modes (e.g., text, image, animation, video, or “text plus audio”) could be used to convey the same content in different situations.

Rhetorical style mode (communication): within the same medium (text or audio, for example) different styles could be used: concise vs. extensive, light vs. in-depth, expert vs. amateur, etc.

Language mode (communication): the choice of a specific foreign language can be considered as a very simple communication mode.

Size mode (communication): complex objects, such as guided tours [17], collections [3], or entities [2], can be delivered in different sizes, according to different situations. A guided tour on a given subject, for example, could consist of five steps for a short version, and twenty steps for a long version.

Topology mode (communication): different topologies can be used to organize information on the same subject, according to different situations. Guided tours, for example, are very often structured as sequences of steps. We have found that sequences are easy to navigate but sometimes hide the real structure of the information; we are currently experimenting (see section 5) with the idea of providing different topologies for the same guided tour, such as trees or lattices, to experienced readers, in order to represent semantic relationships among guided tour constituents, while retaining the linear structuring for more naive readers.

¹ This taxonomy does not claim to be exhaustive, but it covers most of the modes we have found in analyzing over 100 commercial or research hypermedia applications.

Control mode (interaction): different degrees of control could be exercised over the application execution, ranging from the “couch-potato” mode to the “full active control” mode. In the couch-potato mode, the user is mostly passive, doing almost nothing or making simple choices. Full active control means that the user has total control over execution.

Access mode (interaction): applications may range from a “Question&Answer” style of access (typical of database-oriented or information-retrieval-oriented applications) to “Point&Click” browsing. The different access modes are often intermixed; a typical combination is, for example, browsing over a collection of objects previously selected through a query.

When the application is multimode, i.e., several combinations of modes are needed within the same hypermedia, two extreme options are available. One possibility is to have a different interaction paradigm for each mode setting. Another possibility is to support modal interaction, i.e., to provide the same set of commands for each mode combination but to alter their semantics according to the current setting of modes. We will say that a multimode hypermedia is modal if it supports modal interaction, and modeless otherwise.

Defining when a modal approach is more effective than a modeless approach is a design problem that, to our knowledge, has so far received limited attention. If the application is significantly complex, and the number of mode combinations is large, the modeless “version” may require the user to learn and to remember too many commands (one set for each mode combination), thus violating two fundamental usability factors: learnability and memorability [12, 14]. In a modal hypermedia, a user must learn fewer commands, but he needs to understand how a command's effect depends on the current setting of modes; usability problems may arise if the application does not provide sufficient perceptual cues [11, 10] to help the user identify the current combination of modes. Furthermore, if mode setting is under the user's control, the user needs to learn how to change the mode configuration, which introduces additional complexity.

It is outside the scope of this paper to provide general guidelines for designing the appropriate features of modal interaction in each possible situation and for each possible user profile. Our goal is to identify design alternatives and to suggest possible trade-offs. Before addressing these issues, we will first discuss modal interaction from a formal point of view. The formalism, which is relatively simple, is meant to help define some concepts more rigorously and to provide the terminology to analyse more precisely the various design choices discussed in section 4.

3 FORMAL DEFINITION OF MODAL HYPERMEDIA INTERACTION

Our model distinguishes between regular application commands, which affect the execution state of the application, and mode commands, which affect the setting of modes only.

The “classic” (i.e., non-modal) semantics of commands for modeless applications can be formally defined by the following function:

$$\phi : \Gamma \times \Sigma \to \Sigma; \qquad \phi(\gamma, \sigma) \to \sigma' \tag{1}$$

In (1), Γ is the universe of “normal” application commands (possibly with parameters), and Σ is the universe of possible states of execution for the application. φ is the command interpretation function mapping the execution of a command γ, activated in an application state σ, into the new state σ'. The formal definition of “state of execution” is omitted in this paper, since it is not relevant for the purpose of our discussion². We only assume that an execution state does not include the definition of mode settings. φ is a partial function, since some interaction commands might not be available in some states of the application. In other words, φ(γ, σ) is undefined if a command γ is not available in a state σ.
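As an illustration of definition (1), the partial function φ can be sketched as a lookup table in which missing entries stand for "undefined". This is only a minimal sketch: the string encoding of commands and states, the names `PHI` and `phi`, and the sample transitions are assumptions made for the example and do not come from the application described in this paper.

```python
from typing import Dict, Optional, Tuple

Command = str  # an element of Γ (hypothetical string encoding)
State = str    # an element of Σ (hypothetical string encoding)

# Hypothetical transition table. Only the listed (command, state) pairs
# are defined, which is what makes phi a partial function.
PHI: Dict[Tuple[Command, State], State] = {
    ("next", "intro"): "section-1",
    ("next", "section-1"): "section-2",
    ("previous", "section-2"): "section-1",
    ("previous", "section-1"): "intro",
}


def phi(command: Command, state: State) -> Optional[State]:
    """Return the new state σ', or None where φ(γ, σ) is undefined."""
    return PHI.get((command, state))
```

Here `phi("previous", "intro")` yields `None`, modelling a command that is not available in the given state.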

To introduce modal interaction, we need to replace definition (1) with a new one:

$$\phi' : \Gamma \times \Sigma \times M \to \Sigma; \qquad \phi'(\gamma, \sigma, \mu) \to \sigma' \tag{1a}$$

In (1a), Γ, Σ, σ, γ, and σ' are defined as for modeless applications. M is the set of mode combinations (also called mode states or mode settings) available in the application, and μ is a combination of modes in M which represents the setting of modes currently active in the application state σ. According to this definition, the same command, applied to two identical execution states, can have different semantics, depending on the combination of currently active modes (i.e., μ).
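The difference between (1) and (1a) can be made concrete with a small sketch in which the lookup key also includes the mode setting. The names `PHI_MODAL` and `phi_modal`, as well as the sample commands, states, and modes, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

Command = str  # an element of Γ
State = str    # an element of Σ
Mode = str     # an element of M (one mode combination)

# Hypothetical table for φ': the same command in the same execution state
# leads to different new states depending on the active mode setting.
PHI_MODAL: Dict[Tuple[Command, State, Mode], State] = {
    ("enter", "intro", "visual"): "visual-node-1",
    ("enter", "intro", "text"): "text-node-1",
}


def phi_modal(command: Command, state: State, mode: Mode) -> Optional[State]:
    """Return the new execution state σ' only; the mode is never changed here."""
    return PHI_MODAL.get((command, state, mode))
```

Because `phi_modal` returns an execution state and nothing else, a normal command cannot alter the mode setting in this sketch, which mirrors the absence of side effects on modes in definition (1a).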

We could have considered the set of modes as part of the description of the execution state of the application, reducing formula (1a) to (1); we believe, instead, that it is convenient to keep the description of the state of execution of an application distinct from the mode setting. One reason is to keep a clean separation between normal application commands and commands altering the mode setting: definition (1a) makes it clear that normal commands do not have side effects on modes, i.e., they do not implicitly modify the current configuration of modes. Thus there is a unique behavior for each command executed in a given state and in a given setting of modes.

² The reader is referred to [6, 8] for approaches modelling the state of multimedia objects and to [5] for a definition of the notion of navigation state in modeless applications.

In order to modify the configuration of modes, we introduce a new set of commands, the semantics of which is defined by the following interpretation function:

$$\psi : \Xi \times M \times \Sigma \to M \times \Sigma; \qquad \psi(\xi, \mu, \sigma) \to (\mu', \sigma); \qquad \psi(\xi_0, -, \sigma) \to (\mu_\delta, \sigma) \tag{2}$$

In (2), Ξ is the set of mode setting commands, M is the set of mode states, and Σ is the set of execution states for the application, as in (1a). The partial function ψ maps each mode command into its meaning; a mode command ξ, applied to a mode state μ, produces a new mode state μ' without affecting the execution state σ. The special mode command ξ₀ is used to switch to a special mode state μ_δ and can be considered as the mode re-setting command. In fact, μ_δ is the standard, i.e., default, mode setting. We can assume that at any time during a session there is only one default mode setting, defined by the default assignment function δ.
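Definition (2) can be sketched in the same style. In the sketch below the default assignment function δ is simplified to a single `default_mode` argument, and `RESET`, `MODE_TRANSITIONS`, and `make_psi` are hypothetical names introduced for the example. The point it illustrates is that a mode command returns a new mode together with the unchanged execution state, and that the re-setting command works from any mode.

```python
from typing import Callable, Dict, Optional, Tuple

Mode = str         # an element of M
State = str        # an element of Σ
ModeCommand = str  # an element of Ξ

RESET: ModeCommand = "reset"  # plays the role of the re-setting command ξ0

# Hypothetical mode transitions; missing pairs mean ψ is undefined there.
MODE_TRANSITIONS: Dict[Tuple[ModeCommand, Mode], Mode] = {
    ("to-manual", "automatic"): "visual",
    ("to-automatic", "visual"): "automatic",
}

Psi = Callable[[ModeCommand, Mode, State], Optional[Tuple[Mode, State]]]


def make_psi(default_mode: Mode) -> Psi:
    """Build ψ for a given default mode (a stand-in for the function δ)."""

    def psi(command: ModeCommand, mode: Mode, state: State) -> Optional[Tuple[Mode, State]]:
        if command == RESET:
            # ψ(ξ0, -, σ) → (μ_δ, σ): the current mode is irrelevant.
            return (default_mode, state)
        new_mode = MODE_TRANSITIONS.get((command, mode))
        if new_mode is None:
            return None  # ψ is partial
        # The execution state σ is passed through untouched.
        return (new_mode, state)

    return psi
```

For instance, with `psi = make_psi("automatic")`, the call `psi("to-manual", "automatic", "section-1")` changes only the first component of the pair.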

4 DESIGN TRADE-OFFS

The formal definitions introduced in the previous section are preliminary to the discussion about when and how the mode combinations should be set or modified in a modal hypermedia, and by whom. Different design options can be considered.

A) To assign the control of the mode settings to the system manager only. Technically speaking, this means that the mode setting commands (the ξ's and ξ₀) as well as the default assignment function δ are not available within the normal execution of the application, but to a “special” user only. This solution can be valid in situations where the same application must be deployed in different “versions” for different purposes. A museum hypermedia, for example, can be designed so that it is used as an information point installed at the entrance of the museum, as a consultation point in the reading room, as a professional system in an office, as a CD-ROM usable at home, as a WWW application, etc. Instead of creating different applications, the same application could be used with a different standard configuration of modes in each different delivery context. Each deployed application will have one modality of execution (according to the default setting of modes defined by the system manager), possibly different for each installation point. Each deployed application behaves as a modeless application, but the overall application environment is modal.

B) To let the user select the desired combination of modes at the beginning of a session and modify the setting during the normal execution of the application. Technically speaking, this means that the default assignment function δ and the mode setting commands (the ξ's and ξ₀) are always available to the user within the normal execution of the application. This solution is very flexible, and can be appropriate when the same installation must be shared by a large community of users, with different skills, roles, and tasks. The drawback of this solution is that it requires the user to be able to select the proper setting of modes and to “tune” the application to his needs.

C) To provide a subset of different initial mode settings for each different version of the application, and to allow the user to choose a mode configuration at the beginning of a session and to switch to the default mode setting during the application execution. This is a compromise between (A) and (B). The default assignment function δ and the mode setting commands (the ξ's and ξ₀) are available to the system manager to configure each installation. Some mode setting commands ξ are also available to the user to select a mode configuration different from the default one (usually set by the system manager), but only at the beginning of a session. In other words, ψ(ξ, μ, σ) → (μ', σ) is defined for the user only for a subset of M and only if σ is the initial state σ₀. In addition, at any time during a session the user can switch to the default mode setting by activating the re-setting command ξ₀. This solution takes into account the different needs of each application deployment, while at the same time it provides a good degree of flexibility to the user³.

D) To support automatic adaptation of the mode configuration. Technically speaking, this means that the mode setting commands are automatically invoked by the system, under certain circumstances. During the execution of the application, the system could establish the proper configuration of modes depending upon the user profile, the pattern of usage, the task being accomplished, etc. [13, 16]. This solution is the most ambitious and could appear very attractive, but is also the most complex. In practice, it is seldom adopted, and if adopted, it is seldom fully satisfactory. A simple example of automatic mode adaptation can be found in the “Louvre” multimedia CD-ROM (a French product by “Réunion des Musées Nationaux” and Montparnasse Multimédia), where the presentation of a painting and some interaction modalities change if the reader has visited the same subject before. For example, the first time the user visits a painting, he gets a full screen image and an audio comment; at the end of the audio, the screen automatically changes to a “static” presentation showing a small size image of the same painting, the painter's picture, and some navigation buttons. In any subsequent access to the same painting (within the same session) the user gets the static presentation first. From here, a navigation button allows the user to access the full screen image with the audio comment; unlike the first time, at the end of the audio nothing happens, and the user must guess that, by clicking anywhere on the screen, he can return to the static presentation. The idea behind this “adaptive” behavior is probably that the static presentation and full control over navigation are more appropriate if the user is somehow “expert” on the subject domain, i.e., he has some knowledge about the current painting since he has previously explored it. Still, in some usability experiments that we have run, we observed that users were disoriented and did not understand what was happening.

³ A slightly more sophisticated version of solution C is to also make the default assignment function available to the user, to allow him to select a specific mode configuration as his own default setting.
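The restriction described for option C can be sketched as a guarded version of ψ as seen by the user. This is a minimal sketch under several assumptions that are not part of the paper: mode commands are encoded as strings of the form `select:<mode>`, the "subset of M" is read as the set of modes the user may select, and `INITIAL_STATE`, `RESET`, and `make_user_psi` are hypothetical names.

```python
from typing import Callable, FrozenSet, Optional, Tuple

Mode = str
State = str
ModeCommand = str

INITIAL_STATE: State = "start"  # stands for the initial state σ0
RESET: ModeCommand = "reset"    # stands for the re-setting command ξ0
SELECT_PREFIX = "select:"       # hypothetical encoding of the ξ commands

UserPsi = Callable[[ModeCommand, Mode, State], Optional[Tuple[Mode, State]]]


def make_user_psi(default_mode: Mode, selectable: FrozenSet[Mode]) -> UserPsi:
    """Build the user-visible part of ψ for one installation (option C)."""

    def psi(command: ModeCommand, mode: Mode, state: State) -> Optional[Tuple[Mode, State]]:
        if command == RESET:
            # Resetting to the default is allowed at any time in the session.
            return (default_mode, state)
        if command.startswith(SELECT_PREFIX):
            target = command[len(SELECT_PREFIX):]
            # Selection is defined only in the initial state and only for
            # the modes the system manager made selectable.
            if state == INITIAL_STATE and target in selectable:
                return (target, state)
        return None  # undefined for the user in every other case

    return psi
```

In this reading, the system manager configures an installation by choosing `default_mode` and `selectable`, while the user can pick a mode only before leaving the initial state.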

If solutions (B) or (C) are chosen, an additional design issue concerns how the mode setting commands (the ξ's and ξ₀) should be made available to the user. One possibility is to provide explicit commands, easily at hand for the user. This solution may increase usability for the sophisticated user, but it may be disorienting for a less expert person, who might involuntarily modify the mode setting and get confused by the change of behavior of the application. To improve usability, it is crucial that the application does a good job of informing the user about the current mode he is in and how to enter the other modes.

Another solution is to provide the user with “hidden” mode commands, in the sense that only the expert user is informed of them off-line, while an unaware user may never realise that they are available. This choice has the advantage of retaining simplicity and safety for the inexperienced user, while still allowing the experienced user greater control over the navigation style. Some usability problems may result if, by chance, the unaware user discovers this trapdoor and activates mode commands with no knowledge of their effects. Yet another solution is to allow mode setting commands only when the application is in a special execution state that is difficult to reach (or reachable with protected access only); here the lack of flexibility is balanced by the “safety” of the solution.

In the next section, we will exemplify the concepts discussed so far by briefly describing modal interaction in the hypermedia application “Polyptych”.

5 EXAMPLES FROM THE HYPERMEDIA APPLICATION “POLYPTYCH”

“Polyptych” is a hypermedia presentation of museum research concerning the “Agostinian Polyptych” by Piero della Francesca⁴. It is currently installed at the Poldi Pezzoli museum in Milano, within a “traditional” exhibition on Piero della Francesca, and in the museum house in Tuscany (Borgo San Sepolcro) where Piero della Francesca was born. Polyptych will also be available as a CD-ROM edition by next year.

“Polyptych” is intended for a variety of users, who have been classified into three major categories:

“casual” visitors: just passing by the information point by chance. For them the application is a “walk-up-and-use” system that is only intended to be used once, probably for a short time. They might differ in their skill with computers and hypermedia technology;

“intentional” visitors: they have some knowledge of, or at least a significant interest in, the subject domain and want to learn more about it. They might differ in their knowledge of hypermedia technology and in the amount of time available to explore the application;

“specialist” visitors: they are specialists in the application domain, e.g., researchers in history of art; again, they might differ in knowledge of technology and in the time available to use the application.

Modal interaction in “Polyptych” has been designed so that it can take into account the needs of these different categories of users, and the different situations of use (i.e., time availability, motivations to use the application, tasks to be accomplished, etc.).

In the rest of this section, we will briefly present the content and the structure of the application, using the concepts of HDM, the Hypermedia Design Model [2, 3, 4]. This description is preliminary to the discussion concerning the design of modal interaction in “Polyptych”.

5.1 Content structure

The overall content of “Polyptych” has been organized as a set of eight main structures called “paths” (corresponding to HDM “collections”): “Reconstruction”, “Technical Analysis”, “Restoration”, “Fashion”, “Textiles”, “Jewelry”, “Archive Documents”, “Renaissance Art Related Works”. Each path corresponds to a research topic, and its content has been created by a different group of art researchers (from the Poldi Pezzoli, the Milanese “Academy of Brera”, the Florentine “Opificio delle Pietre Dure”, the University of Lecce, the Library of Borgo San Sepolcro).

⁴ The so-called “Agostinian Polyptych” is one of the most mature works by Piero della Francesca, one of the greatest artists of the Italian Renaissance. The various components of this polyptych, with the exception of the central panel, which was lost, are currently exhibited in some important museums world-wide: Frick Collection in New York; National Gallery of Lisbon; National Gallery of London; Poldi Pezzoli Museum in Milano. For years, a team of art researchers has attempted to virtually “reconstruct” how the overall polyptych might have looked. This research is presented in our hypermedia application, proposing a number of possible reconstruction hypotheses. These are based on the analysis of previous restorations, the investigation of ancient documents, and the comparative analysis of sculpture, textile, fashion, jewellery, every-day life, and religious life in the Renaissance.

All paths have a similar organization, consisting of a short Introduction and a set of “sections”. A section corresponds to a subtopic of the general subject of the path, and consists of one node of type “Visual”, one node of type “Text”, and several nodes of type “Detail”. A “Visual” node consists of a large image, a caption, and an audio comment. A “Text” node includes one or two columns of text, one or more images, and, sometimes, animations.

Nodes of type “Visual” and “Text” provide essential, non-specialist information about the section subject. A “Detail” node provides additional information on the section subject and has a structure very similar to that of “Text” nodes, but its rhetorical style is mainly for art experts.

In each path, the set of sections is organized according to two topologies, sequence or lattice, to address different user categories. The lattice structure is intended to capture the semantic relationships among the topics presented in a given path, and is mainly intended for domain experts. The linear structure arranges the sections according to a suggested sequence of reading. To each topology corresponds a node (“Index Node”, in the HDM terminology [2, 3, 4]), which shows the path structure and allows the user to select a section and access it directly.

Mode setting            Control                  Size      Media                         Topology   User profile
Automatic Navigation    Passive (Couch Potato)   Short     Image + Audio                 Sequence   Casual visitor, low skill
Visual Navigation       Manual                   Short     Image + Audio                 Sequence   Casual visitor, average skill
Text-based Navigation   Manual                   Average   Text + (images + animation)   Sequence   Intentional visitor
Expert Navigation       Manual                   Long      Text + (images + animation)   Lattice    Domain specialist

Table 1: mode settings in “Polyptych”

5.2 Mode settings

Four possible settings of modes are available in “Polyptych”, corresponding to different styles of navigation: Automatic Navigation, Text-based Navigation, Visual Navigation, and Expert Navigation. They are schematically summarized in table 1.

The rest of this subsection will describe the meaning of each mode setting, using examples from the path “Jewelry” (“Ricami Metallici”). Figure 1 shows the node representing the introduction to this path.

Figure 1: introduction node of path “Jewelry” (“Ricami Metallici”). The button “Entra” is associated with a modal command.

The buttons “Avanti” (Forward) and “Indietro” (Backward)correspond to modeless commands, and allow to proceed tothe introduction of the “next” or “previous” path,respectively. The button “Entra” (standing for “Enter thepath”) allows to explore the various sections of the path.This button corresponds to a modal command, its effectsbeing dependent from on the current setting of modes. Table2 summarizes the semantics of this command in eachdifferent setting of modes, referring to the figures shownalong this section.

Automatic navigation. Under this setting of modes, the usercan browse automatically across the different sections of thecurrent path, visiting only nodes of type “Visual nodes”.This mode setting has been designed for totally novicecasual visitors, since it provides a quick overview of thecontent of each path and requires a minimum degree of usercontrol. An example of Visual node that has been accessedunder this mode setting is shown in Figure 2a.

Figure 2a: first visual node of path “Jewelry”. Mode setting = “Automatic Navigation”.

The application proceeds to the next section at the end of the audio comment, without requiring any user interaction. The user may only change the path of interest or switch to mode setting “Visual Navigation” (see below) by selecting the button “Manuale” (“Manual”).

Visual navigation. This mode setting is intended for casual visitors who have some hypermedia skill and want to exercise a certain degree of control over the navigation. The main difference with respect to the “Automatic Navigation” mode setting is that the user must explicitly request the transition from a section to the Next or Previous section (or to the First or Last section). Figure 2b shows the same section of path “Jewelry” presented in Figure 2a, but now the application is under the “Visual Navigation” mode setting. The node includes the buttons to navigate across the path, such as “Avanti” (Forward) and “Ultimo” (“Last”, to access the last section of the current research path).

| Mode setting | Effect of modal command “Entra” | Navigation control on the destination |
|---|---|---|
| Automatic Navigation | display the node of type “Visual” in fig. 2a, simultaneously activating the audio comment | At the end of the audio comment, the next node in the path (of type “Visual”) is displayed automatically |
| Visual Navigation | display the node of type “Visual” in fig. 2b, simultaneously activating the audio comment | At the end of the audio comment, navigation along the path (across nodes of type “Visual”) is under user control |
| Text-based Navigation | display the node of type “Text” in fig. 3 | Navigation is under user control; next and previous nodes are of type “Text” |
| Expert Navigation | display the node of type “Text” in fig. 5 | Similar navigation as in Text-based mode setting, but now additional navigation links are available |

Table 2: effects of clicking the modal button “Entra” in each setting of modes
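The way the semantics of “Entra” depend on the mode setting can be sketched as a lookup keyed by the current mode setting. The Python fragment below is only an illustration of Table 2: the names `ModeSetting`, `EntraEffect`, `ENTRA_EFFECTS`, and `enter_path` are assumptions introduced here and are not taken from the “Polyptych” implementation.

```python
from dataclasses import dataclass
from enum import Enum


class ModeSetting(Enum):
    AUTOMATIC = "Automatic Navigation"
    VISUAL = "Visual Navigation"
    TEXT_BASED = "Text-based Navigation"
    EXPERT = "Expert Navigation"


@dataclass(frozen=True)
class EntraEffect:
    node_type: str       # type of the node displayed by "Entra"
    audio_comment: bool  # Table 2 mentions an audio comment only for Visual nodes
    auto_advance: bool   # next node displayed without any user action
    extra_links: bool    # additional navigation links on the destination


# Hypothetical encoding of Table 2: one entry per mode setting.
ENTRA_EFFECTS = {
    ModeSetting.AUTOMATIC: EntraEffect("Visual", True, True, False),
    ModeSetting.VISUAL: EntraEffect("Visual", True, False, False),
    ModeSetting.TEXT_BASED: EntraEffect("Text", False, False, False),
    ModeSetting.EXPERT: EntraEffect("Text", False, False, True),
}


def enter_path(mode: ModeSetting) -> EntraEffect:
    """Resolve the modal command "Entra" under the given mode setting."""
    return ENTRA_EFFECTS[mode]
```

The same button thus maps to four different effects, while the command itself carries no knowledge of the mode; only the lookup does.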


Figure 2b: first visual node of path “Jewelry”. Mode setting = “Visual Navigation”.

The user may change the mode setting (switching to “Automatic”) or change the path of interest. In addition, by using the button “Vai a ....” (Goto), the user can access the Index node, which shows the path structure (see figure 4).

Text-based navigation. This setting is intended for “intentional visitors”, since it provides a significant amount of information for each path and requires the active participation of the user to explore this content. The user can access the different sections via nodes of type “Text”. Figure 3 shows an example of a node accessed under mode setting “Text-based Navigation”. The content (concerning the same section of the path as in figures 2a and 2b) appears in two columns of text and one image. The interaction is slightly more complex than in the Visual Navigation mode setting, since some new commands are available. Some of them are associated with symbols embedded in the text. Textual notes, for example, can be displayed by selecting the note symbols (numbers in round brackets), and animations can be activated by selecting the symbol “@”. No command is available to the user to control animations. The user may change the mode setting, switching to “Automatic” or “Visual” Navigation.

Expert navigation. This setting is appropriate for domain experts, e.g., researchers in the application subject: the amount of content is larger and the representation of information is richer and more complex than in the other mode settings, in terms of topology of the path (lattice), content (a section is now represented by a node of type “Text” plus a number of nodes of type “Details”), and rhetorical style (more specialist). Richer structure and content imply that interaction is also more complex, since new navigation links and commands to control active media (animation) are now available.

Figure 5 shows the first node of the first section of path “Jewelry” under mode setting “Expert Navigation”. Compared with figure 3, the text is the same, but the image is now the third frame of an animation; the right column of text is covered by a note commenting on the animation; and below the frame caption there are now buttons to control the animation execution. In addition, some small book-like icons appear below the left column of text; they represent navigation commands to access nodes of type “Details” that provide further information on the current section. The nature of these additional contents (not available in the “Text-based” setting of modes) is strictly technical, and the rhetorical style of their texts is appropriate for domain experts. One such “Details” node is shown in figure 6.

Figure 3: first text node of path “Jewelry”. Mode setting = “Text-based Navigation”.

Figure 4: index node showing the structure of path “Jewelry”. Mode setting = “Text-based Navigation”.

Figure 5: first text node of path “Jewelry”. Mode setting = “Expert Navigation”.

Figure 6: third “Details” node associated with the first text node of path “Jewelry”. Mode setting = “Expert Navigation”.

Finally, some commands on the node in figure 5 have a different meaning with respect to the same commands under Text-based mode (see figure 3). For example, the command to access the Index node of the path now displays a lattice structure, shown in figure 7. We can compare figure 7 with figure 4, which shows a linear topology and represents the Index node of the same path under the Text-based, Visual, and Automatic Navigation settings of modes.

Figure 7: Index node showing the structure of path “Jewelry”. Mode setting = “Expert Navigation”.

A lattice captures more information, at the expense of a certain difficulty both in understanding the intended meaning of the structure and, above all, in using it. In fact, in the “Expert Navigation” mode setting the commands to navigate across the path are apparently the same, but their semantics is substantially changed, since the path structure is no longer linear. The command associated with the button “Avanti” (Forward), for example, may not take the user directly to the next section; in general, it may offer a number of “next” options, since several sections may semantically “follow” the current one in a lattice structure.
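The change in the semantics of “Avanti” can be illustrated by representing each topology as a map from a section to its successors. The sketch below is hypothetical: the section names `A` to `D` and the functions `forward` and `must_ask_user` are assumptions made for illustration, since the actual sections of the path are not listed in this paper.

```python
from typing import Dict, List

# Hypothetical path structures; keys are sections, values are their successors.
SEQUENCE: Dict[str, List[str]] = {"A": ["B"], "B": ["C"], "C": []}
LATTICE: Dict[str, List[str]] = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def forward(topology: Dict[str, List[str]], current: str) -> List[str]:
    """Return the sections that "Avanti" may lead to from `current`."""
    return list(topology[current])


def must_ask_user(options: List[str]) -> bool:
    """More than one successor means the user has to choose among "next" options."""
    return len(options) > 1
```

In the linear case `forward` always yields at most one section, so the command can jump directly; in the lattice case it may yield several, and the interface has to present a choice.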

5.3 Mode configuration control

A final consideration concerns the way mode configurations are controlled in “Polyptych”. The application is deployed at several information points in the museum, and the system manager can choose a different initial mode setting for each installation. When the application is reset to the cover page (either manually or through a time-out mechanism), the mode setting defined at installation time is always restored.

Switching among the Text-based, Visual, and Automatic Navigation mode settings is based upon explicit configuration functions and/or mode commands, which are represented by visible buttons displayed on nodes (see figures 2a, 2b, and 3). Commands to switch to the Expert Navigation mode setting, instead, are somewhat “hidden”, in the sense that an unaware user may never realize that they are available. In fact, the user can set the mode configuration to “Expert Navigation” only from the Index node displaying the structure of the current path (see figure 4), by using the right mouse button to select the section of interest. The effect of this deliberately “obscure” command is to place the user on a Text node like the one shown in figure 5, under the Expert Navigation setting. After this action, any further command other than a mode change command will be interpreted in the context of the Expert Navigation setting. The intention is that only specialists or strongly motivated users will exploit the possibilities of this mode setting, while normal users will not be aware of it and will not be confused by its intrinsic complexity.
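The control policy described in this subsection can be summarized in a small sketch. It is a simplified illustration under stated assumptions, not the “Polyptych” code: the class `InformationPoint`, its method names, and the string labels for modes and nodes are all hypothetical.

```python
class InformationPoint:
    """One installation of the application, with its own default mode setting."""

    # Mode settings reachable through visible buttons on the nodes.
    VISIBLE_MODES = {"Automatic", "Visual", "Text-based"}

    def __init__(self, initial_mode: str) -> None:
        self.initial_mode = initial_mode  # chosen by the system manager
        self.mode = initial_mode
        self.node = "cover"

    def reset(self) -> None:
        """Manual reset or time-out: restore the cover page and the default mode."""
        self.mode = self.initial_mode
        self.node = "cover"

    def press_mode_button(self, mode: str) -> None:
        """Visible buttons switch only among the three non-expert settings."""
        if mode not in self.VISIBLE_MODES:
            raise ValueError(f"no visible button for mode {mode!r}")
        self.mode = mode

    def select_section_on_index(self, section: str, button: str = "left") -> None:
        """Selecting a section on the Index node; a right click enters Expert mode."""
        if button == "right":
            self.mode = "Expert"
        self.node = f"section:{section}"
```

Keeping the installation default separate from the current mode is what makes the reset behaviour trivial: `reset` simply copies one value over the other.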

5.4 Discussion

We have carried out some evaluation studies of the usability of “Polyptych” and of the effectiveness of its modal interaction design. Intensive user testing has involved casual visitors, art teachers and students visiting the exhibition where the application is installed, and art researchers from the institutions that collaborated in the project. The analysis and interpretation of the test results are still ongoing and will be discussed in detail in a future report.

We can anticipate here that art specialists, once informed about the “hidden” mechanism to set the mode configuration to “Expert Navigation”, have used it quite extensively with a good degree of satisfaction, and they have also appreciated the possibility of switching to other mode settings to explore the application content in different ways. In particular, art teachers frequently switched from Expert to Visual Navigation when discussing the various topics with their students through the application. We have also noticed that students, after starting their visit under the Automatic Navigation mode, tended to switch to Visual Navigation after 3-6 minutes of use and to continue the exploration under this configuration (with short jumps back and forth to Text-based Navigation). Finally, user testing has shown that the mechanism to switch to Expert Navigation is safe for naive users: during our observations, no user pressed the right mouse button and switched to Expert Navigation by chance unless explicitly informed about this possibility.

6 CONCLUSIONS

By their very nature, hypermedia applications employ a large number of different modes, of different types. Multiple modes improve the quality and richness of the applications, accommodating the needs of different users in different situations. Multiple combinations of modes, on the other hand, increase complexity and may create usability problems.

In this paper we propose modal interaction as a technique of hypermedia interaction design that achieves, at the same time, simplicity for users and the flexibility to tune interaction styles to specific user needs.

Modal interaction is characterized by a clean separation between normal application commands and mode setting commands: normal commands affect the execution state of the application, while mode setting commands affect mode configurations only. The semantics of normal application commands depend upon both the execution state and the mode settings.

We have discussed an example of modal interaction, as implemented in “Polyptych”, a hypermedia application developed at HOC-Politecnico di Milano in co-operation with the Poldi Pezzoli Museum in Milano. In “Polyptych”, the design of the various modes and of the various mechanisms of modal interaction has taken into account the needs of different types of users: casual visitors, intentional visitors who have some interest in the subject domain of the application, and experts, i.e., researchers in the history of art.

All the configurations of modes and modal navigation techniques discussed for “Polyptych” have been implemented. The implementation is based upon a navigation engine that maintains the representation of the execution state separately from the mode configuration, and distinguishes among modeless commands, mode control commands, and modal commands.
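The separation between execution state and mode configuration, and the three classes of commands, can be sketched as follows. This is a minimal illustration under assumptions made here: the class `Engine`, its method names, and the use of plain strings for states and modes are hypothetical and do not describe the actual navigation engine of “Polyptych”.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Engine:
    state: str  # execution state, e.g. the current node
    mode: str   # mode configuration, kept separately from the state

    def modeless(self, effect: Callable[[str], str]) -> None:
        """Modeless command: the new state depends on the current state only."""
        self.state = effect(self.state)

    def mode_control(self, new_mode: str) -> None:
        """Mode control command: changes the mode configuration only."""
        self.mode = new_mode

    def modal(self, effects: Dict[str, Callable[[str], str]]) -> None:
        """Modal command: the effect on the state is selected by the mode."""
        self.state = effects[self.mode](self.state)
```

For example, a modal command such as “Entra” would be passed as a dictionary with one effect per mode setting, so that the same button leads to a Visual node under one configuration and to a Text node under another, while a mode control command leaves the current node untouched.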

Our research is planned to continue along the following directions:

a) to define a richer set of modes, with the proper effects upon application commands; the most promising are size modes and topology modes

b) to improve the set of mode commands, for initial setting and modification of modes

c) to generalize the architecture of the current navigation engine, increasing the flexibility of its modal navigation mechanisms

d) to improve the switching among the different modes, under user control

e) to conclude the analysis of usability experiments to validate the effectiveness of the various design choices of “Polyptych”.

ACKNOWLEDGEMENTS

We would like to thank all the members of the “Polyptych” team for their help and constant enthusiasm in this project. We are especially grateful to A. Mottola Molfino, A. Zanni, and A. Di Lorenzo from the Poldi Pezzoli, C. Frosinini and M. Bellucci from Opificio delle Pietre Dure in Florence, L. Polcri from the Library of Borgo San Sepolcro, A. De Marchi from University of Lecce, G. Butazzi and M. Pinin Brambilla, G. Restano, F. Bolognesi, and M. Angeleri from Politecnico di Milano, and the Image Processing group at ITIM-CNR. We would also like to thank the many visitors of the Poldi Pezzoli museum who helped to test “Polyptych”. We also acknowledge the generous contribution of EPSON-Italy for the hardware equipment.

REFERENCES

1. Bearne M., Jones S., Bearne J. S-F. M. “Towards Usability Guidelines for Multimedia Systems”. In Proc. ACM Multimedia ’95, S. Francisco (CA), Oct. 1995

2. Garzotto F., Paolini P., Schwabe D. “HDM - A Model Based Approach to Hypermedia Application Design”. In ACM Trans. Inf. Syst., 11 (1), Jan. 1993

3. Garzotto F., Mainetti L., Paolini P. “Adding Multimedia Collections to the Dexter Model”. In Proc. ACM ECHT'94, Edinburgh (UK), Sept. 1994

4. Garzotto F., Mainetti L., Paolini P. “Hypermedia Design, Analysis, and Evaluation Issues”. In Comm. ACM, 38 (8), Aug. 1995

5. Garzotto F., Mainetti L., Paolini P. “Navigation in Hypermedia Applications: Modeling and Semantics”. In Journal of Organizational Computing, 6 (3), 1996

6. Gibbs S., Breiteneder C., Tsichritzis D. “Data Modeling of Time-Based Media”. In Proc. ACM SIGMOD, Minneapolis, May 1994

7. Hanne K., Bullinger H. “Multimodal Communication: Integrating Text and Gesture”. In Blattner M.M., Dannenberg R.B. (eds.) Multimedia Interface Design, ACM Press, 1992

8. Hardman L., Bulterman D.C.A., Van Rossum G. “Adding Time and Context to the Dexter Model”. In Comm. ACM, 37 (2), Feb. 1994

9. Hill W., Wroblewski D., McCandless T., Cohen R. “Architectural Qualities and Principles for Multimodal and Multimedia Interfaces”. In Blattner M.M., Dannenberg R.B. (eds.) Multimedia Interface Design, ACM Press, 1992

10. Kahn P. “Global and Local Hypermedia Design in the Encyclopaedia Africana”. In Fraisse S., Garzotto F., Isakowitz T., Nanard J., and Nanard M. (eds.) Hypermedia Design, Springer, 1996

11. Norman D. “Design Rules Based on Analyses of Human Errors”. In Comm. ACM, 26 (4), April 1983

12. Nielsen J. “Usability Engineering”. Academic Press, 1993

13. Norcio A.F., Stanley J. “Adaptive Human-Computer Interfaces: a Literature Survey and a Perspective”. In IEEE Trans. Systems, Man, and Cybernetics, 19 (2), March/April 1989

14. Preece J. “Human-Computer Interaction”. Addison Wesley, 1994

15. Rudnicky A.I., Hauptmann A.G. “Multimodal Interaction in a Speech System”. In Blattner M.M., Dannenberg R.B. (eds.) Multimedia Interface Design, ACM Press, 1992

16. Stotts P.D., Furuta R. “Dynamic Adaptation of Hypertext Structure”. In Proc. ACM Hypertext’91, S. Antonio (TX), Dec. 1991

17. Trigg R.H. “Guided Tours and Tabletops: Tools for Communicating in a Hypertext Environment”. In ACM Trans. Inf. Syst., 6 (4), 1988