
O. Castillo et al. (Eds.): Soft Computing for Intell. Control and Mob. Robot., SCI 318, pp. 303–313. springerlink.com © Springer-Verlag Berlin Heidelberg 2010

Modeling Facial Expression of Intelligent Virtual Agents

María Lucila Morales-Rodríguez, Fabián Medellín-Martínez, and Juan J. González B.

Instituto Tecnológico de Ciudad Madero, División de Estudios de Posgrado e Investigación, Cd. Madero, Tamaulipas, Mexico {lmoralesrdz,fabermed}@gmail.com, [email protected]

Abstract. An intelligent virtual agent is an intelligent agent with a digital representation and endowed with conversational capabilities; this interactive character exhibits human-like behavior and communicates with humans or virtual interlocutors using verbal and nonverbal modalities such as gesture and speech. Building credible characters requires a multidisciplinary approach, since the behaviors that must be reproduced are very complex and must be decoded by their interlocutors. In this paper we introduce a kinesics model to select the nonverbal expression of intelligent virtual agents able to express their emotions and purposes using gestures and facial expressions. Our model selects the nonverbal expressions based on a causal analysis and on data mining algorithms applied to perceptual studies, in order to identify significant attributes in emotional facial expressions.

1 Introduction

Intelligent virtual agents are increasingly present in user interfaces because they try to simulate the forms of human communication and thus to improve human-machine interaction. Their design involves a multidisciplinary approach combining computer science with psychological, sociological and philosophical issues [1]; the challenge is to endow them with human characteristics like emotions and personality, and thus to allow the suspension of disbelief.

An intelligent virtual agent is an intelligent agent with a digital representation and endowed with conversational capabilities; when this interactive character exhibits human-like behavior and communicates with humans or virtual interlocutors using verbal and nonverbal modalities such as gesture and speech, it is known as an Embodied Conversational Agent (ECA) [2]. These kinds of agents try to transmit an illusion of life and reproduce the richness and the dynamics of human behavior [3]. Characters able to transmit this illusion can more easily produce an experience of immersion in a virtual world. They are being used in educational applications as virtual tutors, or as presenters or guides in visits to virtual museums. They can also be found as salesmen or real estate agents, personal trainers and therapists, among other applications.

These agents appear in social interfaces as animated actors that represent human beings and are used to improve interactivity with users. To improve this interactivity, it is not enough that the virtual character looks real in order to be believable and catch the interest of the user. The interaction must be expressed in a natural way; for that reason, the character must behave and communicate in the same way that a human does, using verbal and nonverbal expressions. Gratch et al. [1] identify face-to-face conversation, emotions and personality, and human figure animation as the key issues that must be addressed in creating virtual humans.

Following these issues, the creation of virtual characters has evolved from fantasy characters to virtual humans. The first projects worked with a specific set of facial expressions from a corpus produced by the work of Ekman and his colleagues [4]. These projects were talking faces trying to reproduce emotional expressions [5-7] and specific nonverbal functions in order to synchronize the verbal expression [8]. Other works express blends of emotions of different types and the performative of the communicative act being performed [9].

The interest of our work is the selection of the facial expression of a virtual character, so that it can express its emotional context coherently in a nonverbal way. The inherent difficulty of this task arises because a human being can experience a variety of emotions, which are represented by a wide range of expressions and are influenced by personality traits and context.

In this paper we introduce a kinesics model in order to model the nonverbal expression of intelligent virtual agents able to express their emotions and purposes using gestures and facial expressions. Our model is based on a causal analysis and on data mining algorithms applied to perceptual studies, in order to identify significant attributes in emotional facial expressions and thus design a selection model of facial expressions.

2 Expressing Emotional Behavior

There are several ideas about what can give believability to virtual agents. For Badler [3], what makes a virtual human "human" is not just a well-executed exterior design but movements, reactions, and decision-making which appear "natural", appropriate, and contextually sensitive. To be able to produce this behavior, these life-like characters must communicate credibility in their emotional expressions (coherent reaction), exhibit personality and be able to express their indexical and reflexive behavior, which are transmitted by verbal, paraverbal and nonverbal activity, shown through facial, vocal and gestural expressions.

In this work we used the behavioral model proposed by Morales [10] to handle the perceptions of the agent and determine its actions on the basis of its emotional state and personality traits. This model works with a series of attributes that define the emotion, mood, attitude, intention and objective of the agent. These elements model the behavior and are constantly updated; we characterized them and used them for the selection of the emotional expression.
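
As an illustration, the following is a minimal Python sketch (not the authors' implementation) of a container for these behavioral attributes; the example field values are taken from the executions in Table 7, except the objective, whose value is a placeholder since no example is given in the text.

from dataclasses import dataclass

@dataclass
class BehavioralState:
    # Attributes produced by the behavioral model and consumed by the kinesics model.
    emotion: str     # e.g. "fear", "anger", "sadness", "joy"
    mood: str        # e.g. "negative", "neutral"
    attitude: str    # e.g. "directive"
    intention: str   # e.g. "demand", "inform", "reprimand"
    objective: str   # current goal of the agent

state = BehavioralState(emotion="fear", mood="negative", attitude="directive",
                        intention="demand", objective="placeholder_objective")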

Fig. 1 shows the general architecture of a virtual character, where the output data of the behavioral model are taken by a kinesics model in order to select and integrate the nonverbal expression of the virtual character.

Fig. 1. Components of a general architecture for the animation of a virtual agent

An interesting detail worth mentioning is the way the behavioral model handles the emotional state of the agent. In order to handle and represent the emotions of the agent, a circumflex model is used, divided into eight regions by four axes that represent basic psychological concepts. According to this approach, an emotion is represented by a set of psychological characteristics that allow it to be defined and classified. The circumflex in Fig. 2 shows the distribution of the four most recognized emotions: fear, anger, sadness and joy.

The emotion label is selected based on its position in the space delimited by the four axes. For example, the emotion of anger lies in the portion of space characterized by positive values of stress, arousal and stance, as well as a negative value of valence.
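
As a small illustration (our own sketch, not part of the paper's implementation), an emotional state can be held as a point on the four axes, and membership in the anger region can be tested with the sign pattern given in the example above.

from dataclasses import dataclass

@dataclass
class EmotionalState:
    stress: float
    arousal: float
    stance: float
    valence: float

def in_anger_region(e: EmotionalState) -> bool:
    # Anger: positive stress, arousal and stance, and negative valence.
    return e.stress > 0 and e.arousal > 0 and e.stance > 0 and e.valence < 0

print(in_anger_region(EmotionalState(stress=0.5, arousal=0.7, stance=0.3, valence=-0.6)))  # True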

306 M.L. Morales-Rodríguez, F. Medellín-Martínez, and J.J. González B.

Fig. 2. Classification of the emotions in four dimensions according to the circumflex model proposed in Morales [10]

In this paper we detail the process related to facial expression in the kinesics model. We examined the existing relations between the facial expressions and the attributes given by the behavioral model, such as intention, objective, emotion, mood and attitude; these relations are used to determine the design of the selection model.

The facial expressions are created from all the possible combinations of 5 eyebrow shapes and 4 mouth shapes, giving a total of 20 different images (see Fig. 3).

Fig. 3. The twenty faces resulting from the combinations of the different eyebrow and mouth shapes
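
A minimal sketch of how such a catalogue can be enumerated; the shape identifiers below are placeholders, not the actual image names used by the authors.

from itertools import product

EYEBROW_SHAPES = ["brow_1", "brow_2", "brow_3", "brow_4", "brow_5"]   # 5 eyebrow shapes
MOUTH_SHAPES = ["mouth_1", "mouth_2", "mouth_3", "mouth_4"]           # 4 mouth shapes

# Every face image is one (eyebrow, mouth) combination: 5 x 4 = 20 images.
FACE_CATALOGUE = list(product(EYEBROW_SHAPES, MOUTH_SHAPES))
assert len(FACE_CATALOGUE) == 20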

3 Characterization of Behavioral Attributes

In order to analyze the existing relations between the facial expressions and the attributes given by the behavioral model, we implemented a survey in which the participants characterized the set of twenty images using the attitude, intention, mood and emotion attributes of the behavioral model.

The information generated was used to develop a causal analysis in order to observe the weight of each attribute. We performed the causal analysis using Tetrad1 (Fisher's Z test, alpha value of 0.01) and obtained that the attribute "mood" is the most significant of all (Fig. 4).

Fig. 4. Causal analysis of the attributes that characterize the facial expressions

We also applied data mining using Naive Bayes and ID3 classifiers to identify the best method to predict the expression that corresponds to the attributes of the behavioral model. Fig. 5 shows the success percentage of these classifiers; we note that the Naive Bayes classifier was the best, with a 50% success rate.

Fig. 5. Success percentage of classifiers

Although the Naive Bayes classifier has low performance, we can observe that the predicted images do not vary too much from the expected ones (Table 1).
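
The kind of comparison reported in Fig. 5 could be reproduced along the following lines. This is only a sketch with scikit-learn stand-ins (CategoricalNB for Naive Bayes and an entropy-based decision tree in place of ID3); the file name and column names are assumptions, not the authors' data.

import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

survey = pd.read_csv("survey_responses.csv")               # hypothetical survey corpus
encoder = OrdinalEncoder()
X = encoder.fit_transform(survey[["mood", "emotion", "attitude", "intention"]])
y = survey["image_id"]                                      # which of the 20 faces was chosen

classifiers = {
    # min_categories makes sure every category level is known even if a CV fold misses some.
    "Naive Bayes": CategoricalNB(min_categories=[len(c) for c in encoder.categories_]),
    "ID3-like decision tree": DecisionTreeClassifier(criterion="entropy"),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.2f}")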

1 More information and free software for the Tetrad project can be found at http://www.phil.cmu.edu/projects/tetrad


Table 1. Comparison between the results obtained by the Naive Bayes classifier and the expected image, for the cases in which the classifier failed

In order to distinguish the facial expressions, we developed another survey that asked the participants to identify the emotion expressed by each face through the characterization of the four axes shown in the circumflex. With this new information we built a new circumflex from the most popular answers of the survey (see Fig. 6).

Fig. 6. New allocations of the emotional labels in the circumflex



This last survey gives us the information needed to verify the interpretation of an emotion based on the four dimensions that characterize the circumflex. We are now able to use the data of the emotional axes to identify an emotion and its intensity. This information has been used in the design of the selection model.

4 Design of the Face Expression Selection Model

The selection model of facial expressions works with the information generated by the behavioral model and with the corpus produced by the surveys.

The model uses the corpus that has been classified by mood and by the four axes that characterize an emotion. This corpus is processed in two layers in order to reduce it and then treated by a Naïve Bayes algorithm.

The facial expression selection model is shown in Fig. 7 and its main algorithm in Table 2.

Fig. 7. Structure of the Selection Model of Facial Expression

The first layer of the model applies a mood filter, in which we choose only the images that were characterized with the current mood value of the behavioral model.

We have chosen mood as the filter attribute because of the results of the causal analysis. This filter gives us a sub-catalogue of expressions.


Table 2. Algorithm of the selection model of face expressions

Facial_Expression_Model(Facial_Ex_DB, attributes)
1. for each gesture ∈ Facial_Ex_DB such that mood_gesture = mood_attributes
2.     add gesture into First_Candidates_Facial_Ex
3. end for each
4. Final_Facial_Ex ← ExpressionSelection(Facial_Ex_DB, First_Candidates_Facial_Ex, attributes)
5. return Final_Facial_Ex
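
A rough Python transcription of this algorithm, as a sketch under our own data-layout assumptions rather than the authors' code: each database entry is assumed to be a dict carrying the mood label it received in the surveys, and expression_selection stands for the second layer defined in Table 3 (sketched further below).

def facial_expression_model(facial_ex_db, attributes):
    # Layer 1: mood filter - keep only the faces characterized with the current mood.
    first_candidates = [gesture for gesture in facial_ex_db
                        if gesture["mood"] == attributes["mood"]]
    # Layer 2: Naive Bayes ranking followed by the intensity-based choice (Table 3).
    return expression_selection(facial_ex_db, first_candidates, attributes)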

This sub-catalogue and the attitude, intention and emotion attributes are used in the second layer (see Fig. 8).

Fig. 8. Structure of the Selection Method of Facial Expression

This second process selects from the database the facial expression that best matches the emotion label, intensity and attitude. Table 3 shows the details of the selection process.

Table 3. Function ExpressionSelection

ExpressionSelection(Facial_Ex_DB, First_Candidates_Facial_Ex, attributes)
1. Second_Candidates_Facial_Ex ← TopRankingNaiveBayes(First_Candidates_Facial_Ex, attributes)
2. Final_Facial_Ex ← FaceExpSelection(Facial_Ex_DB, Second_Candidates_Facial_Ex, attributes)
3. return Final_Facial_Ex

This method calls a Naive Bayes classifier to select the three best expressions (function TopRankingNaiveBayes, see Table 4); these facial expressions could be variations of the same emotion affected by its intensity.


Table 4. Function that returns the top three expressions ranked by Naive Bayes

TopRankingNaiveBayes(Facial_Ex, A)
1. for each f ∈ Facial_Ex
2.     Probabilities_f ← 0
3.     for each a ∈ A
4.         Probabilities_f ← Probabilities_f + ObtainProbability(a)
5.     end for each
6. end for each
7. order Facial_Ex by Probabilities
8. return Facial_Ex_0, Facial_Ex_1 and Facial_Ex_2
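
A Python sketch of this function, following the pseudocode literally; the face dictionaries and the obtain_probability helper (standing for the Naive Bayes estimate of an attribute value given a face) are our own assumptions.

def top_ranking_naive_bayes(candidate_faces, attributes, obtain_probability):
    scores = {}
    for face in candidate_faces:
        total = 0.0
        for attr_name, attr_value in attributes.items():
            # Accumulate the probability contributed by each behavioral attribute.
            total += obtain_probability(face, attr_name, attr_value)
        scores[face["id"]] = total
    # Order the candidates by their accumulated probability and keep the best three.
    ranked = sorted(candidate_faces, key=lambda f: scores[f["id"]], reverse=True)
    return ranked[:3]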

The next step uses the intensity of the emotion to select among the three facial expressions with the highest Naive Bayes probabilities; see the function FaceExpSelection (Table 5).

Table 5. Algorithm with the final selection

FaceExpSelection(Facial_Ex_DB, Candidates_Facial_Ex, A)
1. Intensity ← EmotionIntensity(Arousal_A, Stress_A, Stance_A, Valence_A, α)
2. for each gesture ∈ Candidates_Facial_Ex
3.     sum ← 0
4.     for each emotion ∈ gesture_emotions
5.         Probabilities ← Probability_emotion × Intensity_emotion
6.         sum ← sum + Probabilities
7.     end for each
8.     Probability_gesture ← sum
9. end for each
10. return the gesture with maximum Probability_gesture in Candidates_Facial_Ex
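
A Python sketch of this selection step (our reading of Table 5, not the authors' code): each candidate face is assumed to store, from the surveys, a probability for every emotion label it can convey, and emotion_intensity stands for the calculation of Table 6, sketched after that table.

def face_exp_selection(candidate_faces, axes, alpha, emotion_intensity):
    best_face, best_score = None, float("-inf")
    for face in candidate_faces:
        # Weight each emotion the face can express by the intensity of that emotion
        # for the current axis values, and sum the contributions.
        score = sum(prob * emotion_intensity(label, axes, alpha)
                    for label, prob in face["emotion_probabilities"].items())
        if score > best_score:
            best_face, best_score = face, score
    return best_face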

The intensity of the emotions involved in the three best facial expressions is calculated by the operation described in Table 6. This calculation takes the axis values related to the definition of the emotion. The intensity is obtained from a sum of the values of the four axes, where the axes have different weights: the axes adjacent to the emotion label have a smaller weight than the nonadjacent axes.

Table 6. Intensity calculation of each emotion label

Emotion Label   Intensity Calculation
Anger           α[(-1)(Valence) + Stance] + (1-α)[Stress + Arousal]
Surprise        α[Valence + Stress] + (1-α)[Stance + Arousal]
Joy             α[Arousal + (-1)(Stress)] + (1-α)[Stance + Valence]
Pleasure        α[(-1)(Arousal) + Stance] + (1-α)[(-1)(Stress) + Valence]
Relaxation      α[Valence + (-1)(Stance)] + (1-α)[(-1)(Stress) + (-1)(Arousal)]
Sadness         α[(-1)(Valence) + (-1)(Stress)] + (1-α)[(-1)(Stance) + (-1)(Arousal)]
Distress        α[(-1)(Arousal) + Stress] + (1-α)[(-1)(Stance) + (-1)(Valence)]
Fear            α[Arousal + (-1)(Stance)] + (1-α)[Stress + (-1)(Valence)]
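
The formulas of Table 6 can be transcribed directly; in this sketch the axis values are passed as a dict and α is left as a parameter, since its value is not given in the text (the 0.5 below is only a placeholder).

INTENSITY_FORMULAS = {
    "anger":      lambda a, x: a * (-x["valence"] + x["stance"]) + (1 - a) * (x["stress"] + x["arousal"]),
    "surprise":   lambda a, x: a * (x["valence"] + x["stress"]) + (1 - a) * (x["stance"] + x["arousal"]),
    "joy":        lambda a, x: a * (x["arousal"] - x["stress"]) + (1 - a) * (x["stance"] + x["valence"]),
    "pleasure":   lambda a, x: a * (-x["arousal"] + x["stance"]) + (1 - a) * (-x["stress"] + x["valence"]),
    "relaxation": lambda a, x: a * (x["valence"] - x["stance"]) + (1 - a) * (-x["stress"] - x["arousal"]),
    "sadness":    lambda a, x: a * (-x["valence"] - x["stress"]) + (1 - a) * (-x["stance"] - x["arousal"]),
    "distress":   lambda a, x: a * (-x["arousal"] + x["stress"]) + (1 - a) * (-x["stance"] - x["valence"]),
    "fear":       lambda a, x: a * (x["arousal"] - x["stance"]) + (1 - a) * (x["stress"] - x["valence"]),
}

def emotion_intensity(label, axes, alpha):
    return INTENSITY_FORMULAS[label](alpha, axes)

# Example with the axis values of the first execution in Table 7.
axes = {"stress": 0.639, "arousal": 9.0, "stance": -1.164, "valence": -4.343}
print(emotion_intensity("fear", axes, alpha=0.5))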

5 Experimental Results

In Table 7, we present a sequence of executions of the Behavioral Model and the experimental results obtained.


Table 7. Executions of the system

Case 1
Input: Mood: Negative, Emotion: Fear, Attitude: Directive, Intention: Demand
Axes: Stress: 0.639, Arousal: 9.0, Stance: -1.164, Valence: -4.343
Naive Bayes candidate images (probability / intensity):
  0.00291307382 / 0.0523153323
  0.001356211169 / 0.013809930878
  0.00120193359 / 0.083002777777

Case 2
Input: Mood: Negative, Emotion: Fear, Attitude: Directive, Intention: Inform
Axes: Stress: 0.513, Arousal: 9.0, Stance: -0.156, Valence: -1.156
Naive Bayes candidate images (probability / intensity):
  0.00145653691 / -0.032766633
  0.000997146937 / 0.014757436922
  0.000801289062 / 0.063547222222

Case 3
Input: Mood: Neutral, Emotion: Fear, Attitude: Directive, Intention: Demand
Axes: Stress: 0.41, Arousal: 9.0, Stance: -0.125, Valence: -0.978
Naive Bayes candidate images (probability / intensity):
  0.0000000495 / -0.03043611111
  0.00000002828 / -0.0778872237
  0.0000000165 / -0.0611533859

Case 4
Input: Mood: Neutral, Emotion: Fear, Attitude: Directive, Intention: Reprimand
Axes: Stress: 0.363, Arousal: 9.0, Stance: -0.123, Valence: -0.907
Naive Bayes candidate images (probability / intensity):
  0.00015470312 / -0.11520138888
  0.000000028288 / -0.0794237301
  0.000000090008 / -0.06280523702

The table shows the inputs to the selection model, the three images with the best probabilities according to the Naive Bayes classifier, and the image of the final selection. Each image is presented with its probability and emotional intensity values. The final selection is the image with the maximum intensity.

In these examples we observe that, in some cases, the final image selected does not correspond to the image with the maximum probability from the Naive Bayes classifier; this could be influenced by the use of several inputs (attributes) with the same weight in the classifier. Adding the emotional intensity gives greater importance to the emotional state and defines the final selection. This procedure follows the results of the causal analysis, where emotion is the attribute with the second greatest weight.

6 Conclusions

In this paper we present a facial expression selection model as part of a kinesics model for intelligent virtual agents able to express their emotions and purposes using gestures and facial expressions.


The difficulty of this task arises because a human being can experience a variety of emotions, which are represented by a wide range of expressions, and which are influenced by personality traits and context. Our model selects the facial expressions guided by a causal analysis of the attributes involved in the behavioral model of the virtual character and by a Naive Bayes classifier. The selection model processes a face corpus according to the weight of these attributes and the emotional intensity. The model seeks to contribute to improving the believability of virtual humans.

Acknowledgements

This work is a product of project 2457.09-P, supported by the Dirección General de Educación Superior Tecnológica (DGEST) of the Secretaría de Educación Pública (SEP) and by the Instituto Tecnológico de Ciudad Madero.

References

1. Gratch, J., et al.: Creating Interactive Virtual Humans: Some Assembly Required. IEEE Intelligent Systems 17(4), 54–63 (2002)

2. Cassell, J., et al. (eds.): Embodied Conversational Agents, vol. 426. MIT Press, Cambridge (2000)

3. Badler, N.: Virtual humans for animation, ergonomics, and simulation. In: IEEE Workshop on Non-Rigid and Articulated Motion, Puerto Rico (1997)

4. Ekman, P., Friesen, W.V., Hager, J.C. (eds.): Facial Action Coding System: The Manual and User Guide. Salt Lake City, Human Face (2002)

5. Bui, T.D., et al.: Generation of Facial Expressions from Emotion Using a Fuzzy Rule Based System. In: Australian Joint Conference on Artificial Intelligence. Springer, Adelaide (2001)

6. De Rosis, F., et al.: From Greta's Mind to her Face: Modelling the Dynamics of Affective States in a Conversational Embodied Agent. International Journal of Human-Computer Studies 59(1-2), 81–118 (2003)

7. Pelachaud, C., Bilvi, M.: Computational Model of Believable Conversational Agents (2002)

8. DeCarlo, D., et al.: Specifying and animating facial signals for discourse in embodied conversational agents. Computer Animation and Virtual Worlds 15, 27–38 (2004)

9. Pelachaud, C., Poggi, I.: Facial Performative in a Conversational System. In: Workshop on Embodied Conversational Characters, WECC 1998, Lake Tahoe, USA (1998)

10. Morales Rodríguez, M.L.: Modèle d'interaction sociale pour des agents conversationnels animés. Application à la rééducation de patients cérébro-lésés. PhD Thesis, Institut de Recherche en Informatique de Toulouse, Université Paul Sabatier, p. 108 (2007)