An Instructor’s Assistant for Team-Training in Dynamic Multi-agent Virtual Worlds

An Instructor's Assistant for Team-Training indynamic multi-agent virtual worlds.Stacy C. Marsella & W. Lewis JohnsonMay 6, 1998Longer version of paper to be presented at ITS-98AbstractTeams of people operating in highly dynamic, multi-agent environ-ments must learn to deal with rapid and unpredictable turns of events.Simulation-based training environments inhabited by synthetic agentscan be e�ective in providing realistic but safe settings in which to de-velop the skills these environments require. However such trainingenvironments present a problem for the instructor who must evaluateand control rapidly evolving training sessions. We address the instruc-tors' problem with a pedagogical agent called the PuppetMaster. ThePuppetMaster manages a network of spy agents that report on theactivities in the simulation in order to provide the instructor with aninterpretation and situation-speci�c analysis of student behavior. Theapproach used to model student teams is to structure the state spaceinto an abstract situation-based model of behavior that supports in-terpretation in the face of missing information about agent's actionsand goals.1 IntroductionVirtual worlds inhabited by synthetic agents are increasingly beingused for training and education (e.g., [12, 7, 16]). These virtual worldscan provide a highly engaging environment in which to develop skills1

that students can readily apply to real world tasks. In creating theseenvironments, considerable e�ort is often expended so that the simu-lated world more faithfully mirrors the real world in which the acquiredskill must be employed. This concern for the �delity of the trainingsimulation has a long standing tradition (e.g., [4]).Training teams of people to operate in dynamic multi-agent set-tings presents special challenges, however. The concern for �delityhas led to training environments that are in turn very dynamic. Tofaithfully capture the unpredictable multi-agent setting, simulationscan be inhabited by multitudes of synthetic agents which ideally ex-hibit the same kind of complex behaviors that human participantswould exhibit. For example, Distributed Interactive Simulation (DIS)training sessions involve teams of human students interacting withpotentially thousands of synthetic and human agents within a verydynamic battle�eld simulation (e.g., [16]).From an instructor's perspective, the use of very dynamic multi-agent virtual environments raises several concerns. Among these is theseemingly simple question of what the student teams are currently do-ing. In a complex simulated world, conditions in the simulation canbe di�cult to determine. For instance, information from any one stu-dent's perception of events may be unavailable, or an agent (human orsynthetic) may not be able to explain their own motives. Furthermore,abstracting from the behavior of individuals to the behavior and goalsof teams requires additional e�ort. Conversely, there may be too muchlow level information for the instructor to absorb and interpret. Andas the number of interacting agents grows, the di�culty increases.Accordingly, it may be di�cult to determine what teams are doingand why they are doing it. Furthermore, determining how well theyare doing and analyzing trends in their behavior are also di�cult. Thisis especially true in a world where interaction is not tightly scripted.Students may undergo unique experiences for which there is no setresponse against which they can be assessed.We address the instructor's problem with a synthetic pedagogi-cal agent, the PuppetMaster, which provides a high-level interpreta-tion and assessment of the students. As the teacher's intermediary,the PuppetMaster dynamically assigns intelligent probes that monitorstudent teams and synthetic agents in the simulation. The situationin the virtual world is then assessed from the perspective of high leveltraining objectives. The assessment includes events, trends and ag-2

gregate reporting over multiple entities (e.g., for revealing teamwork).The resulting evaluation is then used to compose 2-D and 3-D pre-sentations that reduce the instructor's e�ort during the exercise andassists the instructor's review with students after the exercise.In a very dynamic simulation, student teams are often learning tooperate in both goal-directed and reactive fashions. Therefore, it isparticularly critical that the PuppetMaster appropriately model thestudents' loosely scripted interactions with the world, in the face ofchanging goals, partial information and irrelevant information. Thispresents a problem for approaches to pedagogical agents and intelli-gent tutoring that attempt to model students based on detailed planmatching (e.g., [12]), model tracing (e.g., [1]) or plan recognition (e.g.,[5]) based on a step-by-step analysis of actions. Arguably, such ap-proaches, even if plausible for the multi-agent dynamic domain we areinvestigating, are often irrelevant for the instructional support that isneeded.To model student teams, we have adapted a model from reactiveplanning research, the Situation Space [14, 9]. Roughly, a situationspace is a structuring of states of the world into classes of problemsolving histories whereby an agent's recognition of its current situa-tion guides its goal-directed behavior. The PuppetMaster models bothreactive and goal driven behaviors within the framework of a situa-tion space model of the student team. The current situation providesa top-down focus for monitoring, inferring missing information and as-sessing behavioral trends, based on appropriateness for the students'situation.2 The NeedThe domain in which we explored the design of our pedagogical agentis virtual world simulation for the training of military tank crews.These simulations are based on DIS technology whereby independentsimulators are tied together over a network, communicating low levelsimulation events to create a large-scale simulation environment. Thevarious entities interacting in this simulation environment can includeteams of students in tank simulators as well as synthetic forces (semi-autonomous forces) generated by software such as the ModSAF (Mod-ular Semi-Autonomous Forces) program [2].3

In a typical platoon exercise, there are four tank simulators, eachmanned by four students, and approximately �fty synthetic forces.Additionally, a team of Army instructors will manage the exercise,performing the roles of superior o�cer (e.g., sending out changes inorders), teacher (e.g., providing guidance after the exercise during areview session) and adversary (e.g, dynamically creating and taskingopposing forces).An exercise for a tank platoon might involve traveling in wedgeformation to a location in order to occupy a position which blocks theopposing force. While in a wedge formation, there are guidelines as tohow the individual tanks follow each other, keep each other in sight,etc. As the platoon travels to the position along the virtual terrain, itmay encounter friendly or opposing forces to which it has to respondappropriately. These forces can be either synthetic or human agentsand encounters with them can rapidly unfold in unpredictable ways.Likewise, the platoon may encounter terrain features that can slowdown and/or damage its vehicles. Thus, the multi-agent dynamicqualities of the simulation can rapidly impact what needs to be doneto satisfy the broadly scripted exercise, as well as whether it can besatis�ed.During the exercise, the instructors manage the simulation systemin order to extract information and modify ongoing events, often tomake the exercise more challenging. After the exercise, instructorswill review key events for the students on 3-D displays.Based on our observations of these exercises, instructors do notmicromanage or even micro-evaluate the students. For instance, theydon't analyze student behavior on an action-by-action basis unlessthe e�ect of a student's actions is something entirely anomalous inthe current situation. An example of the latter would be if a tankcomes to a halt while the platoon is supposed to be in a travelingwedge, thus disrupting the wedge. Finally, there are also individualdi�erences in how instructors evaluate speci�c skills in the trainingcontext.These characteristics are consistent with the nature of the domainand the skills that need to be acquired. Because of the dynamic, multi-agent domain, there is no guaranteed plan for achieving the goals setout for the students. Without such a plan, or other agreed upon ob-jective basis, micro-evaluation at the level of each and every individualaction is problematic. Furthermore, understanding the rationale for4

every action may require modeling the domain from the perspectiveof every student, e.g., down to the level of every rock which must becircumnavigated to avoid tread damage. Also, because of the dynamicdomain, it is necessary to foster the development of initiative in theteams consistent with situation-appropriate goals and behaviors.Although instructors do an excellent job of managing training ex-ercises, the tools that they have at their disposal are severely lacking.Instructors are often forced to write notes to themselves on paper asthe exercise proceeds, and must rely on subjective impressions. Also,as an exercise unfolds over time, relevant information concerning tran-sient events or trends may be missed.To see the obstacles that instructors currently face, consider Fig-ure 1 which presents a 3D virtual view that an instructor might haveof a platoon in a training exercise. One can see that the tanks are in aparticular formation (it is a wedge formation). Things that are hardto tell from this picture are:� How well have they been maintaining their formation over time?That requires an analysis of the formation over time, but onlywhen the formation is appropriate. A formation that is appro-priate when traveling may be inappropriate when engaging theenemy.� What is preventing them from maintaining good formation? Isit a problem in maintaining visual contact, di�culty navigatingthe terrain, or problems with the vehicles?� Are they in a position to spot the enemy?� Has the enemy spotted them?The information necessary to answer these questions is present inthe simulation, if one looks for it in the right place and the right time.The goal of our pedagogical agent is to extract the information fromthe simulation automatically and to analyze it from the perspective ofthe instructor, permitting the instructor to focus on deciding whereto provide instructional or adversarial feedback.3 The Probes ArchitectureIn order to understand how the PuppetMaster works, let us brie ylay out the overall architecture in which it works, the Probes system.5

Figure 1: An instructor's view of a platoon traveling in a wedge.The architecture of Probes is presented in Figure 2. The simulationexercise analysis is guided by an Exercise Task Description, which de-scribes the mission objectives and instructional objectives. The mis-sion objectives describe the students' goals and what is known aboutcurrent conditions. The instructional objectives provide a breakdownof the skills that need to be analyzed during particular phases/situationsin the exercise. The exercise task description is used to generate a sit-uation space.The PuppetMaster uses the situation space to determine what in-formation is required to recognize and analyze a situation. Monitor-ing instructions are sent to the Articulate Adaptive Monitors thatare embedded in the instrumented ModSAF entities. Depending ontheir instructions, these local monitors collect the data requested bythe PuppetMaster and report back, either on a regular basis, or wheninteresting events occur, or in response to queries from the Puppet-Master. The Puppet Master uses the data that it receives to interpretand assess the situation. Output to the instructor is controlled bythe Display Manager, which in turn sends commands to the displaysoftware.The situation space and its role in the PuppetMaster's evaluationis central to this paper. It is this aspect that we consider next.6

Figure 2: The Probes architecture.4 Situation SpacesAs developed in reactive planning research, Situation Spaces structurethe states of the world into situations and links between situations.The behavior of the agent is organized around its current situationwhich determines both the (sub)goal(s) it should try to achieve (ormaintain) and how to monitor the world. In turn, monitoring de-termines whether traversal to a new situation has occurred. Traver-sals could be caused by successful achievement of a (sub)goal, as wellas fortuitous or bad turn of events. Therefore, a traversal could beadvantageous or dis-advantageous in terms of achieving the overallgoal. Traversals through the situation space represent alternative ab-stract partial decomposition plans which can incorporate both purelygoal-directed behavior as well as goal-directed responses to unforeseenevents.To make these ideas more concrete, consider the situation spacedepicted in Figure 3 for the simpli�ed example problem presentedearlier. Recall the problem was to travel in a wedge formation to ablocking position, responding to encounters with the opposing forceen route.The current situation at the start of the exercise is assumed tobe \Traveling" (Wedge Formation). As noted above, the current sit-7

TravelingWedge Formation

AtGoal

Location

Disabled

Actionon Contact

Contact? Arrived?

Disabled?

Situation

TransitionFigure 3: Example Situation Space.uation determines the goals to achieve. When a platoon is in thisTraveling situation, their goals include traveling along some route to-wards the blocking position, and the maintenance goals of maintaininga wedge formation and scanning for the enemy. As they pursue thesegoals, various kinds of expected or un-expected transitions can occurwhich must be monitored for while in this situation. For instance, thestudents must monitor for reaching their objective, noted as the tran-sition to the Goal situation. Also, they must monitor for contact withother forces. This could cause a transition to an \Action on Contact".If that transition occurs, the goals in this new situation would includeassessing the threat, targeting opposing forces and insuring one is nota good target. (In actuality, actions on contact encompass di�erentkinds of responses and one might expand this situation into severalsituations in order to explicitly capture these di�erences.) Again asnoted by the transition arcs, when in the Action on Contact situation,transitions to \Disabled" or back to \Traveling" must be monitored.Even from this simple example, several characteristics about situ-ation spaces can be noted. There are multiple such paths through thespace, in fact, an in�nite number since looping is possible (betweentraveling and action on contact in this example). This compactlyexpresses the dynamic characteristics in these training environmentswhereby students can undergo repeated and unexpected encounters.The situation space models the necessary reactivity within an overallgoal-directed declarative plan. At the same time, the situation spacedoes not separately represent each possible plan { to do so would beimplausible. Rather, the situation space models sets of possible plansat a high level, with each situation modeling many possible problemsolving histories. Finally, the situation space is noncommittal con-8

cerning how the plan generation occurs within a situation or evenwhat planning architecture should be used. Rather, it is a frameworkin which goal-directed behavior can be planned for and realized in adynamic world.Of course, our interest in situation spaces does not follow from anintent to use them for single agent planning. Rather, we saw situationspaces as a good declarative model around which to organize a ped-agogical agent's analysis of team behavior in a dynamic world. As aconsequence, we needed to transform them from an aid for planningto an aid for analysis of team behavior.4.1 Situation Spaces from an Instructor's per-spectiveWe made two moves to use Situation Spaces to analyze Team behavior.First, we transformed the goals indicated by a situation from goals-to-be-achieved to behaviors-to-be-analyzed. For instance, the goalassociated with traveling in a wedge, along a route to the blockingposition, becomes an analysis goal to determine how well the team istraveling. Similarly, the action on contact goals of \targeting" and\avoid being an easy target" become how well they are targeting andavoid being targeted.The other move we made was to allow for multiple perspectives.This move is necessary because the team of students may have a dif-ferent perspective from the pedagogical agent as to what the currentsituation is, and such discrepancies provide key pedagogical assess-ments. For instance, the students may transition to an action oncontact situation and start �ring at what they think is the opposition,but actually are distant rocks. Meanwhile, the pedagogical agent, dueto its larger perspective, will know that the transition to action oncontact was a mistake. Conversely, the pedagogical agent may iden-tify a viable enemy threat which the students have not seen, perhapsdue to a failure in their scanning for the enemy.A consequence of maintaining multiple perspectives is that, forthose transitions that could result in a bifurcation of perspective, thereneeds to be a transition arc for each perspective. So for the transitionfrom the traveling to the action on contact situations there need tobe 2 arcs, one for the pedagogical perspective and one for the teamperspective. (Figure 3 shows only one of these arcs.)9

Another consequence of maintaining multiple perspectives is thatwe need to determine which situation is the appropriate basis for moni-toring and reporting analyses. Currently, we have found it su�cient toreport analyses based on the team perspective, for several reasons. Re-porting an analysis of the team from a situational perspective di�erentfrom that guiding their behavior tends to generate marginally usefulinformation. In contrast, analyzing team behavior from their per-spective generates the information that is needed to infer the point atwhich they �nally transition into the \correct" situation. For instance,when the team is traveling in a wedge, there are certain behaviors theyare supposed to exhibit such as constantly scanning their turrets andfollowing each other at a certain distance. These behaviors are nottypically appropriate during an active engagement so their cessationis a key indicator for inferring the transition of the team perspective.On the other hand, the pedagogical perspective is key in analyzingpotential failures of the team in assessing their situation.It has also been our experience that the bifurcation into two per-spectives persists only for very short periods of time. There are sev-eral reasons for this. The situation space is at a very high level ofabstraction so a bifurcation tends to indicate a dramatic misread ofthe environment. In addition, given the pressure that the domain ex-erts on the team's behavior, such a misread is likely not to persist forlong. For example, incoming rounds are a solid indicator to the teamthat those \rocks" o� in the distance are actually the opposition �ringat them.Allowing for these two perspectives has proved su�cient to datein characterizing the state of a platoon exercise. However, we couldgo further. Recall there are four tanks in a platoon. One might breakthe team perspective into individual tank perspectives, thus having thepedagogical agent track what it considers to be each tank's perspectiveon the current situation. Now discrepancies between tank perspectivesre ect factors such as breakdown in communication or coordination.This move towards multiple perspectives assumes that the perspec-tives share the same situation space. However, if there were multipleplatoons being assessed, performing di�erent missions, then di�er-ent situation spaces would be more appropriate. Conversely, we maymodel a single agents overall behavior as being composed of distinctsituations in multiple situation spaces. These points of course begthe question of composition over situations which would need further10

study.4.2 Situation Spaces in ProbesProbes uses the situation space to control its monitoring, form itsassessment of the team and selectively report that assessment. InProbes, a situation space is de�ned as a set of situations, with eachsituation being a 3-tuple consisting of:� List of transition predicate functions which might re-set the cur-rent situation and thereby cause a transition out of the situation� Set of evaluation functions for forming an assessment� Initialization function which invokes the monitors pertinent tothe situation and performs any necessary initializations.The situation space discussed here was designed as follows. TheArmy's instructional material laid out the mission goals, the kinds ofsituations (our term) that occur in pursuing the exercise goals alongwith the subset of behaviors that need to be assessed in those sit-uations. Reinforced by our observations of actual training sessions,this material essentially provided the structure of the situation spaceat a high level: the types of situations, as well as the nature of thetransitions and evaluations associated with those situations.From there, the actual transition, evaluation and initializationfunctions were de�ned, roughly in that order. For instance, the tran-sition from the \Travel" to \Action on Contact" situations requiredProbes to determine the conditions under which transitions could beassumed for both team perspective and pedagogical perspectives. Ifthe information to make these determinations was unavailable or hid-den, then a means of inferring the transition was developed, in par-ticular by detecting changes in behavioral trends.Next, evaluation functions were de�ned. For instance, the instruc-tional material and our observations revealed the importance of evalu-ating travel formation, as well as how it was assessed in doctrine and inpractice. Additionally, a limited set of data from actual exercises wasavailable that included instructor evaluations on wedge formations.That data was run through a system for inducing oblique decisiontrees [10] and the resulting decision tree was used as an additionalsource of information, in order to objectively characterize what theinstructors' considered important. Based on these various sources, we11

created an evaluation function that assessed the key characteristics ofa wedge and re�ned it by testing against ModSAF tank platoons. Todelimit the information being presented to instructors, the reportingof the assessment was restricted to when those characteristics quali-tatively changed.Given the information used by the transition and evaluation func-tions, the initialization functions and associated monitors could be de-�ned. The various monitors used by Probes were built into ModSAFand invoked via a software interface. As examples, periodic moni-tors would report the team's location, event driven monitors wouldreport when the team had been spotted by an opposing team andasynchronous monitors were used to assess intervisibility between ve-hicles.Across all the exercises for platoon training, the instructional ma-terial was laid out in a modular fashion that shared exercise goals,situations and assessment criteria. This suggests that the applicationof situation spaces across the full suite of platoon exercises could besemi-automated.4.3 Situation-based vs Event-based ModelingSeveral characteristics of this training domain lead us to structure as-sessment around a situation space. Due to the dynamic, unpredictablenature of the domain, there are no set plans for students to learn whichcould form the basis for student evaluation. Nevertheless, some struc-turing is necessary to ascertain, for instance, whether students aregetting closer to their goals. Further, it was necessary to deal withthe volume of information and to account for missing information.By structuring the state space into a situation space, monitoringis organized top-down around the situation, which in turn usefullyconstrains interpretation and assessment. Behavioral trends can bemonitored and assessed according to their appropriateness within asituation. Changes in behavioral trends within a situation can be usedto infer missing information, such as whether the platoon has spottedthe enemy and is going into action on contact. Unnecessary detailsabout the state of the world are not monitored. And the transitionsbetween situations provide a principled basis of coupling analysis ofplanned and reactive behaviors.A key feature of our approach is that the high level analysis ap-12

propriate to the situation is based mainly on partial state descriptionsand trends in those descriptions. This is quite di�erent from workon student modeling and plan recognition which use a complete ac-tion model as the basis of checking/inferring a plan with an agent'sactions. These action-based approaches are not as tolerant of incom-plete knowledge nor can they easily infer from low level actions thekinds of higher level analysis that are important for the instructorshere.For instance, recognizing that a platoon is trying to travel in awedge formation would be at best di�cult if the recognition was basedon the low level actions that the 16 crew members in the 4 tanks wereexecuting. And evaluating this team behavior is best done at the levelof the abstract, partial state descriptions, especially trends in thosedescriptions, and not at the level of individual discrete actions thesevarious teammembers are performing. The relation between the levelsis not strongly �xed. Moreover, recognition and evaluation is bestdone at the level at which this joint behavior has consequences in thisdynamic environment. Still, causal analysis at the level of individualactions would be of diagnostic utility but only in the context of thehigher level analysis and not as a way of deriving the higher levelanalysis. Some form of constrained plan recognition, for example,may be useful for determining why a wedge formation is falling apart.5 ExampleLet's now consider an example run of the Probes system using a sit-uation space to support an instructor. In the following example, theteam is training on the exercise discussed earlier, traveling in a wedgeto a blocking location. Blue is the platoon being assessed whereas Redis the opposition. Here, both sides are comprised of synthetic agentsbeing run by ModSAF, but in an actual exercise some of the agentswould be human crews in simulators.The Probes system provides instructors with a coordinated setof presentations. Figure 4 shows two presentations: the situationspace for the exercise on the left, with the current situation (teamperspective) highlighted. On the right is a log of high-level eventsand assessments. It is the role of the PuppetMaster to automaticallydetermine which situation is currently in force, by monitoring the13

activities in the simulation. Probes also uses synthesized speech tonotify the instructor of state changes, so that instructors know thecurrent situation regardless of where they are gazing.Figure 4: Probes' Situation Presentation.The exercise analysis log records events and analyses that are likelyto be relevant to the current situation. Note that in the example,when Blue initiates Situation Travel (Wedge) (i.e., starts traveling ina wedge formation), the PuppetMaster reports that vehicles in theplatoon are in a poor wedge formation. This analysis is performedonly when PuppetMaster recognizes the platoon should be in a wedge.In contrast, when Situation Act occurs (i.e., Blue is in a Action onContact Situation) PuppetMaster starts reporting that the platoon isstill in a wedge when it should probably be going \on line", in e�ectmodifying their formation in a fashion consistent with the exigenciesof an active engagement. At this point it stops assessment of the thewedge alignment, since the wedge formation is no longer appropriatefor the current situation.These analyses use partial state descriptions and trends in thosedescriptions, in particular, persistence in the relations between tanksover time. The analyses are not trying to evaluate (or infer) travelin a wedge by reasoning about the actions which the four tanks areexecuting (or the 16 crew members in the case of human platoons).Also note how the exercise analysis log reports when the red force(the opposing force) spots the blue force (the platoon being assessed)and vice versa. PuppetMaster monitors not just the movement of thevehicles, it also accesses state information internal to the vehicle sim-ulations. This allows PuppetMaster to report on the opposing forcestactics and situation assessments. When Red spots Blue, their in-ternal assessment recognizes a threat and, based on that assessment,14

the PuppetMaster's pedagogical perspective transitions to Action onContact (\ActualSit Act"). The pedagogical perspective also wouldhave transitioned if Blue's team perspective had transitioned and thetransition was valid (e.g., the opposing force did not turn out to berocks). In the case of the Blue forces (especially when they are hu-man crews), PuppetMaster infers their intent based on the currentsituation, the objectives of the exercise and trends in the partial statedescriptions that are being monitored. For instance, if Red can (po-tentially) be spotted and is in range, plus Blue has turrets aimingat Red, breaks out of wedge formation, or attens the wedge, then asituation transition for Blue's perspective can be inferred.As the situation changes, Probes automatically displays statisticsabout the unit's performance as appropriate to the current situation.For example, when the platoon is traveling in a wedge formation adiagram appears, such as the one in Figure 5, that shows the distancesbetween tanks and the overall depths of the wedge.Figure 5: A wedge formation gauge.If the instructor wishes to see more information about an observedevent, he can obtain it simply by clicking on the event noti�cationin the event log. This causes Probes to do some shallow reasoningand bring up an additional window showing local terrain conditions,damage to the vehicles, etc. (See Figure 6.) This information helps theinstructor evaluate whether or not the trainee behavior is appropriateunder current circumstances. This same capability may be used duringthe subsequent review with students.Ideally, Probes analysis can be coupled to a 3D (stealth) view ofthe virtual world. Thus, the instructor could click on an event in the15

Figure 6: Probes' presentation of speci�c details concerning awed wedgeformation.exercise analysis log and see a 3D view of the situation at that time.We have experimented with ways of annotating the 3D display withrelevant analysis data but the details are beyond the scope of thispaper.6 StatusThe Probes system has been fully implemented, with all the capabil-ities illustrated in the \Example" section. We are currently pursuingcollaborations that will allow us to evaluate Probes in various trainingcontexts.Our research is also focusing on techniques for automating the con-struction of situation spaces, including the use of a scenario generationas a front-end to the situation space construction [11]. We are studyingthe implications of applying situation spaces to more complex teamsof students where individual students would be best modeled as dis-tinct situation spaces. In particular, we are concerned with how toformalize the relation between these situation spaces. Work in formalmodels of teamwork (e.g., [8]) may potentially be useful here.We expect that our design will have applicability to pedagogicalagents in domains that share the unpredictable nature of team train-ing of tank platoons. To evaluate that, we are currently studyingthe applicability of the method to other domains, including Robocupsoccer simulations [6]. 16

7 Related WorkThis work is closely related to intelligent tutoring systems that employsome form of student modeling (e.g., [1, 12, 17, 3]) Student modelingtypically assumes some task that needs to be performed and someoverall plan(s) of action or method(s) are known to achieve that task.In turn, the plan is used to interpret student actions so as to deter-mine, for example, when a student is making a mistake or is at animpasse. Model tracing [1] di�ers somewhat from this approach inthat a production system is used as a cognitive model of the methodand the events generated by its execution behavior are matched to thestudent's behavior. Such approaches are tied to detailed modellingand matching of the actions or events that will occur. Therefore, theytend to have ready application when the skill to be acquired can berelatively tightly scripted both in terms of what speci�c actions needto be taken and the order in which they should be taken. Further,they presume a complete model of the problem solving task in theform of a plan or a production system which can be used to drawinferences about student behavior.The pedagogical goals for the domains we have been considering,however, represent the other extreme. On the battle�eld, there aremultiple tasks to be performed and multiple agents/teams performingthose tasks. Moreover, there is no overall plan of action either atthe individual or team level that can be guaranteed to achieve thegoal; events can enfold in such a way that obviates any plan. Ratherstudent behavior is tied to the situation the student teams are in. Thesituation strongly impacts what goals need to be achieved and whattasks are to be performed. This view gains support from the fact thatit is consistent with how the Army lays out its exercise plan as wellas how it evaluates the student teams.There are student modeling techniques that lie between these ex-tremes (e.g., [3, 12]). For instance, Situated Plan Attribution [3] isa system for tutoring satellite link operators. Because of the interac-tive, dynamic nature of the operator's task, Situated Plan Attributionattempts to loosen how actions and their goals are �t into an overallplan. Nevertheless the intent to do a detailed �tting still exists and theplan structure, a temporal dependency network, poses stricter causalrelations than the Situation Space places on the transitions betweensituation. The di�erences between the two techniques follow rather17

directly from di�erences in the tutoring domains they focus on andthe kinds of tutoring interaction that is required in those domains.Our work is somewhat more loosely related to plan recognition(e.g., [13, 5] and agent tracking (e.g., [15]). Plan recognition involvesinferring an agent's unknown plan based on their actions. Our peda-gogical agent work di�ers in several ways. The intent is not to infera plan. We have knowledge of the abstract plan. Rather the issue isevaluating how well the various student behaviors serve the objectivesof the plan. Furthermore, plan recognition often presumes plans thatare step-by-step procedures. This is inconsistent with the nature of thereactive plans we are considering { very abstract goal decompositionsrepresented only indirectly by alternative paths through the situationspace. Our concern with reactive components in behavior is shared bythe agent tracking work (e.g., [15]) which does not assume plans arestep-by-step procedure but rather a mix of plan-based and reactiveprocedures. In contrast to the intent of our work, agent tracking doesshare with plan recognition the goal of inferring an agent's plan. As aconsequence, it assumes access to the information and agent's actionmodels that can support that analysis. Finally, in such action-basedapproaches, there would need to be a way to �t the modeling of anindividual agent's behavior into the modeling of team behavior, whichis our focus.8 Concluding RemarksSimulation-based training of teamwork skills for dynamic, unpredictablemulti-agent settings present special challenges from a pedagogical stand-point. It is our view that the instructor can bene�t immensely froma pedagogical agent that is an intelligent interpreter of events in thesimulation. Further, only a synthetic agent can play this role: humanagents have great di�culty even assimilating the information comingfrom all the agents' viewpoints and as a consequence have di�cultyassessing that information. To be useful, such a pedagogical agentalso needs to be able to interpret events from a viewpoint of instruc-tional objectives. Finally, the agent must be able to handle missinginformation as well as di�erent kinds of agent architectures.We have developed a pedagogical agent that can assess a trainingsession from the standpoint of instructional objectives. This pedagog-18

ical agent uses a situation-based approach to modelling behavior thatis consistent with the information it has available and the analyses itneeds to perform.

19

References[1] Anderson, J.R., Boyle, C.F., Corbett, A.T. & Lewis, M.W. R.Cognitive Modelling and Intelligent Tutoring. In Arti�cial Intel-ligence, 42(1), 1990. Pp. 7{49.[2] Calder, R.B., Smith, J.E, Courtemanche, A.J., Mar, J.M.F. &Ceranowicz, A.Z. ModSAF Behavior Simulation and Control.In Proceedings of the Third Conference on Computer-GeneratedForces and Behavioral Representation. Orlando, Fla: Universityof Central Florida Institute for Simulation and Training, 347-356.[3] Hill, R. & Johnson, W.L. Situated plan attribution for intelligenttutoring. In Proceedings of the National Conference on Arti�cialIntelligence. Menlo Park, CA: AAAI Press, 1994.[4] Hollan, J.D., Hutchins, E.L., & Weitzman, L. STEAMER: An In-teractive Inspectable Simulation-Based Training System. In TheAI Magazine, Pp. 15-27. AAAI, Seattle WA, Summer, 1984.[5] Kautz, H. & Allen, J.F. Generalized Plan Recognition. In Pro-ceedings of the National Conference on Arti�cial Intelligence.Menlo Park, CA: AAAI Press, 1986.[6] Kitano, H., Asada, M., Kuniyoshi, Y., Noda, I. & Osawa,E. RoboCup: The Robot World Cup Initiative. In Johnson,W.L. (Ed.) Proc. of The First International Conference on Au-tonomous Agents. New York: The ACM Press, 1997. Pp 340{347.[7] Lester, J.C. & Stone, B.A. Increasing Believability in AnimatedPedagogical Agents. In Johnson, W.L. (Ed.) Proc. of The FirstInternational Conference on Autonomous Agents. New York: TheACM Press, 1997. Pp 16{21.[8] Levesque, H.J,, Cohen, P.R. & Nunes, J.H.T. On Acting To-gether. In Proceedings of the Eighth National Conference on Ar-ti�cial Intelligence (AAAI-90). Los Altos: CA, Morgan Kaufman,1990. Pp. 94{99.[9] Marsella, S. C. & Schmidt, C.F. Reactive planning using a "Sit-uation Space". In Hendler, J. (Ed.) Planning in Uncertain, Un-predictable, or Changing Environments (Working Notes of the1990 AAAI Spring Symposium). Available (preferably) as SRCTR 90-45 of the Systems Research Center, University of Mary-land, College Park, Maryland. 1990.i

[10] Murthy, S. K., Kasif, S. & Salzberg, S. A System for Inductionof Oblique Decision Trees. In Journal of Arti�cial IntelligenceResearch, 2, 1994, Pp. 1-32.[11] Pautler, D., Woods, S. & Quilici, A. Exploiting Domain-Speci�cKnowledge To Re�ne Simulation Speci�cations. Accepted at TheTwelfth IEEE International Conference on Automated SoftwareEngineer-ing, Incline Village, Nevada November 1997. Also available athttp://spectra.eng.hawaii.edu/ alex/Research/Abstracts/aseconf97.html.[12] Rickel, J. & Johnson, J.L. Integrating Pedagogical Capabilitiesin a Virtual Environment Agent. In Johnson, W.L. (Ed.) Proc. ofThe First International Conference on Autonomous Agents. NewYork: The ACM Press, 1997. Pp 30{38.[13] Schmidt, C. F., Sridharan, N.S. & Goodson, J.L. The plan recog-nition problem: an intersection between psychology and arti�cialintelligence. Arti�cial Intelligence, 11(1,2), 1978. Pp. 45{83.[14] Schmidt, C.F., Goodson, J.L., Marsella, S.C., Bresina, J.L., Re-active Planning Using a Situation Space. In Proceedings of the1989 AI Systems in Government Conference, IEEE, 1989.[15] Tambe, M. & Rosenbloom, P. Architectures for Agents that TrackOther Agents in Multi-Agent Worlds. In Wooldridge, M., Muller,J.P. & Tambe, M. (Eds) Intelligent Agents II, Agent Theories,Architecture, and Languages. Berlin: Springer-Verlag, 1996. Pp.156{170.[16] Tambe, M., Johnson, W.L., Jones, R., Koss, F., Laird, J. E.,Rosenbloom, P.S. & Schwamb, K. Intelligent Agents for interac-tive simulation environments. In The AI Magazine, 16(1), Spring,1995.[17] VanLehn, K. Student Modelling. In Polson, M.C. & Richardson,J.J. (Eds.), Intelligent Tutoring Systems. Hillsdale, NJ: LawrenceErlbaum, 1988. Pp. 55{78. ii

An Instructor’s Assistant for Team-Training in Dynamic Multi-agent Virtual Worlds

Documents

Transcript of An Instructor’s Assistant for Team-Training in Dynamic Multi-agent Virtual Worlds