Detecting Concealment of Intent in Transportation Screening: A Proof of Concept


Judee K. Burgoon, Douglas P. Twitchell, Matthew L. Jensen, Mark Adkins, John Kruse, Amit Deokar, Gabriel Tsechpenakis, Shan Lu, Dimitris N. Metaxas, Jay F. Nunamaker Jr., Robert E. Younger

Abstract—Past research in deception detection at the University of Arizona has guided the investigation of concealment detection. A theoretical foundation and model for the analysis of concealment detection is proposed. The visual and verbal channels are the two avenues of concealment detection studied. Several available test beds for visual intent analysis are discussed, and a proof-of-concept study exploring nonverbal communication within the context of concealment detection is shared. Additionally, two methods that may aid in verbally detecting deception during the interviews characteristic of secondary screening are introduced. Message feature mining uses message features, or cues, combined with machine learning techniques to classify messages according to their deceptive potential. Speech act profiling, a method for quantifying and visualizing entire conversations, has shown promise in aiding deception detection. These methods may be combined and are intended to be part of a suite of tools for automating deception detection.

I. INTRODUCTION

Safeguarding the homeland against deception and infiltration by adversaries who may be planning hostile actions poses one of the most daunting challenges of the 21st century. Achieving high information assurance is complicated not only by the speed, complexity, volume, and global reach of communications and information exchange that current information technologies now afford, but also by the fallibility of humans in detecting hostile intent. All too often, the people protecting our borders and public spaces are handicapped by untimely and incomplete information, overwhelming flows of people and materiel, and the limits of human vigilance. Moreover, the vulnerabilities posed by human agents are often exacerbated by the very same technologies that enable amassing the glut of information.

Portions of this research were supported by funding from the U.S. Air Force Office of Scientific Research under the U.S. Department of Defense University Research Initiative (Grant #F49620-01-1-0394) and by the U.S. Department of Homeland Security (Cooperative Agreement N66001-01-X-6042). The views, opinions, and/or findings in this report are those of the authors and should not be construed as official Department of Defense or Department of Homeland Security positions, policies, or decisions.

J. K. Burgoon, M. Adkins, J. Kruse, M. L. Jensen, A. Deokar, D. P. Twitchell, and J. F. Nunamaker are with the Center for the Management of Information at the University of Arizona, Tucson, AZ 85721 USA (phone: 520-621-2640; fax: 520-621-2641; e-mails: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]).

G. Tsechpenakis, S. Lu and D. Metaxas are with the Computational Biomedicine Imaging and Modeling Center at Rutgers University, New Brunswick, NJ 08854 USA (e-mails: [email protected], [email protected], [email protected]).

R. Younger is with the Space and Naval Warfare Systems Center, San Diego, CA 92152 USA (e-mail: [email protected]).

The interactions and complex interdependencies of information systems and social systems render the problem difficult and challenging. We simply do not have the wherewithal to specifically identify every potentially dangerous individual around the world. Although completely automating concealment detection is an appealing prospect, the complexity of detecting and countering hostile intentions defies a fully automated solution. A more promising approach is to integrate improved human detection with automated tools that augment other biometric systems for behavioral analysis, the end goal being a system that singles out individuals for further scrutiny in a manner that reduces false positives and false negatives. Such an approach is needed to assist the transportation and border security personnel who must routinely handle high-stakes situations.

Transportation and border security systems have a common goal: allow law-abiding people to pass through checkpoints while detaining those with hostile intent. These systems employ a number of security measures aimed at accomplishing this goal. The methods and technologies described in this paper may prove useful in prescreening, primary screening, and secondary screening activities.

The usefulness of any method of transportation security must be evaluated. The U.S. Federal Aviation Administration (FAA) has used nine criteria for evaluating such systems. These criteria are aimed at ensuring that the best methods and technologies are deployed in U.S. airports, and the methods reviewed in this paper are given a preliminary evaluation against the nine criteria.

In this paper, we present our current research efforts toward developing automated tools to identify concealment and deception. The paper is organized as follows: Section II discusses the relationship between deception, concealment, internal state, and behavior. Section III explains the model and the methodology we follow to identify concealment based on suspicion level. Sections IV and V discuss verbal and nonverbal methods for concealment detection, respectively. Section VI describes how these technologies might be used for aviation security, and Section VII evaluates the methods against the FAA criteria. Finally, Section VIII concludes and proposes future research.

II. DECEPTION, CONCEALMENT, INTENT AND BEHAVIOR

Deception is defined as a message knowingly transmitted with the intent to foster false beliefs or conclusions [1]. Over the past two and a half years, the Center for the Management of Information (CMI) at the University of Arizona has conducted over a dozen experiments to study deception with over 2000 subjects [2-5]. These experiments have been instrumental in understanding the factors influencing deception, and have guided the building of automated tools for detecting deception and the creation of training for security personnel [6, 7]. Since a person will most likely be deceptive about hostile intentions, research in deception detection has led to the question of whether concealed malicious intent can be inferred from cues in communication.

For those who guard our transportation systems, identifying deception is a difficult daily task. Opportunities for travelers to deceive occur frequently in the screening process, and it is the responsibility of those who monitor travelers to root out deceivers who may be engaged in illegal or terrorist activities. This task is made even more complicated by the brief interactions between agents and travelers, the tremendous flow of people using transportation systems, and the limits of human attention.

The quest for the perfect lie detector or truth serum has been long and has resulted in only a few modest successes. The most common and probably most controversial method of deception detection is the polygraph, commonly known as the "lie-detector test." In a summary of laboratory tests, Vrij reports that the polygraph is about 82% accurate at identifying deceivers [8]. The National Academy of Sciences, however, concluded that such experimental numbers are often overestimates of actual results, especially in personnel screening [9]. Although it is not admissible in court, the polygraph is useful in some investigations for identifying potential suspects. The problem, however, is that the polygraph is a very invasive procedure and one that evokes fear in those subjected to it. Investigators must have a good reason for subjecting someone to a polygraph test, and the subject must agree to take the test. Therefore, even though the polygraph is relatively accurate compared to other methods, its invasive quality renders it useless in most everyday situations.

Other techniques, such as Criteria-Based Content Analysis (CBCA) and Reality Monitoring (RM), are based on the content of interviews with subjects rather than on physiological arousal, as with the polygraph. Because both of these methods, which are considered Statement Validity Analysis (SVA) methods, require an interview with the subject suspected of being deceptive, they are still intrusive, though not as physically invasive as the body-sensor-laden polygraph. Both methods also require trained interviewers to conduct the interview and highly skilled analysts to review the statements and reach a judgment. Neither method provides immediate feedback. CBCA is based on what is known as the Undeutsch hypothesis [10, 11], which states that a statement derived from actual memory will differ in content and quality from a statement derived from fantasy. CBCA uses a set of criteria to evaluate this hypothesis: trained investigators rate a criminal statement against each criterion using a three-point scale. RM uses a list of criteria that overlaps somewhat with CBCA but operates under a different hypothesis: truthful or real memories are likely to contain perceptual, contextual, and affective information, while deceptions or fabrications are likely to contain cognitive operations (thoughts and meanings). In a face-to-face study of 73 nursing students, Vrij found that CBCA and RM detected deception at rates of 79.5% and 64.1%, respectively [12].

Not all deception detection methods are invasive. Computerized Voice Stress Analysis (CVSA), for example, is a technique that analyzes voice pitch changes as a measure of arousal. The technique has been shown to be roughly equivalent in accuracy to the polygraph [13], but as with the polygraph, this method will not be useful in situations where deception is not accompanied by physiological arousal.

Despite the research in face-to-face deception and the success of SVA in some studies, most people remain unable to detect deception in face-to-face media at a rate higher than chance [14]. Several possible reasons have been given for this lack of detection accuracy, including truth bias, visual distraction, situational familiarity, and idiosyncratic behaviors that cloud true deception cues (see [14]:79-81, 98-99 for more detail).

Searching for deceptive cues in behavior has led us to examine one of the roots of deception: intentions that the sender wants to conceal. The intent of a person, whether benign or hostile, is closely tied to his or her internal state. Internal states may be manifested by any of a number of behaviors. However, a single behavior could indicate a number of internal states, as demonstrated in Fig. 1. The relationship between one behavior and multiple internal states renders the task of identifying concealment of intent particularly challenging. Hence, our current research focuses on understanding these mappings and leveraging experience in the area of deception detection to produce methods that identify whether the true intent of a person is being concealed.

Fig. 1 Relationship between concealment, internal state and behavior

III. THEORETICAL FOUNDATION

Several theories and models offer useful perspectives on the linkage between concealment and the overt behavioral manifestations that elicit trust or suspicion. Three theories that are especially germane—interpersonal deception theory, expectancy violations theory, and signal detection theory—are integrated to produce a model of suspicious and trust-eliciting verbal and nonverbal communication. Additionally, we are developing a theory-guided taxonomy for clustering verbal and nonverbal behaviors into appropriate groupings of suspicious and non-suspicious behavior in order to identify those who have the highest probability of concealed malicious intent.

Interpersonal deception theory (IDT) is a key theory for mapping behavioral cues into general behavioral characteristics of deception [15]. IDT depicts the process-oriented nature of interpersonal deception and the multiplicity of pre-interactional, interactional, and outcome factors that are thought to influence it. Among its relevant precepts is the assumption that deception is a strategic activity subject to a variety of tactics for evading detection. It also recognizes the influence of receiver behaviors on sender displays, and it views deception as a dynamic and iterative process, a game of moves and countermoves that enables senders to make ongoing adaptations that further hamper detection. Consequently, a theory of suspicious and trust-eliciting behavior must take into account a variety of moderator variables, each of which may spawn a different behavioral profile.

Expectancy violations theory (EVT) is concerned with what nonverbal and verbal behavior patterns are considered normal or expected, what behaviors constitute violations of expectations, and what consequences violations create [16]. Its proponents contend that specific behavioral cues are less diagnostic than whether a sender's behavior conforms to or violates expected behavioral patterns, and that receivers are more likely to attune to such violations. In other words, it is more useful to classify communication according to whether it includes behavioral anomalies, deviations from a baseline, or discrepancies among indicators. Behavioral patterns that include deviations and anomalies are predicted to influence receiver judgments of credibility and deceit. The theory distinguishes between positive and negative violations. Positive violations may actually foster perceived trustworthiness and credibility, whereas negative violations should foster suspicion. Expectancy violations theory is thus relevant to the process of comparing behavioral profiles against expected norms.

The process of interpreting different verbal and nonverbal cues and clustering them into behavioral characteristics to contrast with the expected behavioral characteristics is non-trivial and challenging, considering the large variation in the behaviors of different human beings. The key segments of this dynamic process are the characteristics of actors, the features of transmission channels, the features of messages, and the information exchange process itself.

Finally, a threshold for deriving the level of suspicion or trust is based on signal detection theory (SDT). Developed by Green and Swets [17], SDT defines two sets of probabilities in a signal detection test in which two possible stimulus types must be discriminated. In the context of intent identification, the two possible stimulus types are concealment and openness. If the actual intent is hostile and the output judgment is suspicion, the trial is a "hit." If the actual intent is benign and the output judgment is suspicion, it is a "false alarm." If the actual intent is hostile but the judgment is one of trust, it is a "miss." Finally, if the actual intent is benign and the judgment is one of trust, it is a correct decision, as shown in Table I.

TABLE I
POSSIBLE JUDGMENTS FROM SDT

Concealment   Judged Suspicion   Judged Trust
Hostile       "Hit"              "Miss"
Benign        "False Alarm"      "Correct Decision"

According to SDT, the output of such a binary test is based on the value of a decision variable, which in the context of concealment identification is the suspicion level. The threshold value of the decision variable is called the criterion. For humans, the selection of a criterion is related not only to the value of the actual stimuli but also to their psychological characteristics. In other words, the criterion is a function of perceived stimuli, which, in the context of concealment detection, are the behavioral profile deviations. The SDT calculation methods described in [18] can be used to study the distribution of the values of the suspicion-level variable across the behavioral profile deviations to determine the appropriate criterion for the final decision making.
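To make the SDT machinery concrete, the sketch below (in Python, with invented suspicion-level scores and a deliberately simple objective, neither of which comes from this paper) sweeps candidate criteria over scores for known hostile and benign cases and reports the resulting hit and false-alarm rates.

import numpy as np

# Hypothetical suspicion-level scores (0-1) for senders of known intent;
# in practice these would come from the behavioral-profile deviations.
benign_scores = np.array([0.12, 0.25, 0.31, 0.18, 0.40, 0.22])
hostile_scores = np.array([0.55, 0.71, 0.48, 0.83, 0.66, 0.59])

def sdt_rates(criterion):
    """Hit and false-alarm rates for a given decision criterion."""
    hits = float(np.mean(hostile_scores >= criterion))         # hostile judged suspicious
    false_alarms = float(np.mean(benign_scores >= criterion))  # benign judged suspicious
    return hits, false_alarms

# Sweep candidate criteria and keep the one maximizing hits minus false
# alarms -- one simple objective; operational settings would weight the
# costs of misses and false alarms differently.
candidates = np.linspace(0.0, 1.0, 101)
best = max(candidates, key=lambda c: sdt_rates(c)[0] - sdt_rates(c)[1])
print(f"criterion={best:.2f}, (hit, false alarm)={sdt_rates(best)}")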

We have integrated these multidisciplinary theories and models into a single systemic framework that guides our experimental work and tool development. The model is shown in Fig. 2. It is a decision model for judging how trustworthy an individual is on a trust-suspicion spectrum, based on demonstrated behavioral cues. The actual intent of the individual can be considered the input to the model, and it is demonstrated in the form of behavioral cues, either verbal or nonverbal. These behaviors include linguistic, content, meta-content, kinesic, proxemic, chronemic, and paralinguistic cues. The behaviors are influenced by the interaction of sender and receiver actions, cognitions, and their mutual influence.

Linguistic cues include features like word selection, phrasing, and sentence structure. An example is a person demonstrating other-centeredness by consistently not referring to himself or herself. Content/theme cues are taken from the meaning of the sender's words. Meta-content cues are derived from the types of topics the content addresses; for example, Reality Monitoring is based on meta-content. Kinesic cues are found in the way a person moves.

Fig. 2 Model of concealment detection based on observed behavioral cues. The behavioral cues are compared to norms for that individual and to general norms. Deviations are noted and combined to produce a judgment on the suspicion of concealment.


Proxemic cues are determined from the distance a person keeps from other people and objects. For example, sitting in the back row of a meeting might indicate disinterest in the meeting. Chronemic cues concern a person's use of time; for example, a person might establish dominance by arriving late to a meeting. Paralinguistic cues are obtained from the vocal channel.

The observed behavioral characteristics of the sender can be compared to the normal or expected characteristics stored in a repository. Unexpected deviations may indicate concealment of malicious intent.

First, the immediate behavioral characteristics are compared with the individual's historical characteristics across multiple episodes within a given context. When individual-level histories are not available, only the second set of expectations is utilized. This second set of expectations comprises a general profile of expected behavior across people within the same scenario. For example, when guilty suspects are questioned face-to-face, they may show a combination of verbal brevity, vocal tension, and over-control of movement. The result is that such individuals typically look more tense, unpleasant, aroused, and submissive than those with nothing to hide.

The deviation of the observed behavioral characteristics from the expected individual and group characteristics, in either the positive or negative direction, indicates a suspicion level. By setting a proper threshold on this deviation measure, and given a certain context, the automated tool will be able to indicate the probability that the sender is suspicious or trustworthy.
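As a rough illustration of this comparison step, the following sketch scores one sender's cues against stored norms using absolute z-scores. The cue names, numbers, and threshold are hypothetical placeholders, not values from our experiments.

import numpy as np

# Hypothetical observed cue measurements for one sender, each on a 0-1 scale.
observed = {"verbal_brevity": 0.9, "vocal_tension": 0.8, "movement": 0.1}

# Expected norms per cue: (mean, standard deviation). These would come from
# the individual's history when available, else from the general profile for
# the same scenario.
norms = {"verbal_brevity": (0.4, 0.20),
         "vocal_tension": (0.3, 0.15),
         "movement": (0.5, 0.20)}

def suspicion_level(observed, norms):
    """Combine per-cue deviations (absolute z-scores) into one score."""
    zs = [abs(observed[cue] - mean) / sd for cue, (mean, sd) in norms.items()]
    return float(np.mean(zs))

THRESHOLD = 2.0  # placeholder criterion; see the SDT discussion above
score = suspicion_level(observed, norms)
print("suspicious" if score > THRESHOLD else "within expected range", round(score, 2))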

IV. VERBAL CUES AND LINGUISTIC ANALYSIS

Message feature mining and speech act profiling are two methods for automated analysis of verbal interactions. Both methods, given good automatic speech recognition, have the potential to aid transportation security by giving screeners, especially secondary screeners, feedback concerning potential concealment in security interactions. To work effectively in a transportation security context such as airport screening, these verbal deception and concealment detection methods would require effective speech recognition software. We do not discuss automatic speech recognition in this paper; however, we believe that current speech recognition technologies would be sufficient for the requirements of the verbal detection methods: neither requires complete word recognition, and the probabilistic nature of both means that any speech recognition errors are simply (though, of course, undesirably) added to the total error of the system.

A. Message Feature Mining

Message feature mining [7] is a method for classifying messages as deceptive or truthful based on content-independent message features. It can be divided into two major steps, feature extraction and classification, each with its own sub-steps. Table II summarizes the procedure.

TABLE II
SUMMARY OF INTENT-BASED TEXT CLASSIFICATION PROCEDURE

1) Extract features.
   a) Choose appropriate features for deceptive intent.
   b) Determine the granularity of feature aggregation (i.e., sentence, paragraph, etc.).
   c) Calculate features over the desired text portions.
2) Classify.
   a) Manually classify documents.
   b) Prepare data for automatic classification.
   c) Choose an appropriate classification method.
   d) Train the model on a portion of the data.
   e) Test the model on the remaining data.
   f) Evaluate the results and modify the features, granularity, and/or classification method to improve results.
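A minimal end-to-end sketch of the Table II procedure appears below, assuming a toy hand-labeled corpus and two stand-in features; scikit-learn's decision tree substitutes for whichever classifier an actual study would select.

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def extract_features(message):
    """Two stand-in features; Table III lists cues an actual study used."""
    words = message.split()
    sentences = [s for s in message.split(".") if s.strip()]
    return [len(words),                            # word quantity
            len(words) / max(len(sentences), 1)]   # average sentence length

# Toy hand-labeled corpus (step 2a): 1 = deceptive, 0 = truthful.
corpus = [
    ("I definitely think the mirror is the most important item.", 1),
    ("The water ranks first because dehydration kills fastest.", 0),
    ("Maybe we should rank the salt tablets higher, I guess.", 1),
    ("A flashlight helps with night signaling and travel.", 0),
]
X = [extract_features(m) for m, _ in corpus]
y = [label for _, label in corpus]

# Steps 2d-2e: train on a portion of the data, test on the remainder.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y)
model = DecisionTreeClassifier().fit(X_train, y_train)  # an inductive decision tree
print("accuracy:", model.score(X_test, y_test))         # step 2f: evaluate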

1) Extracting Features

Extracting features involves choosing the appropriate features for deception on which the messages will be classified, determining the granularity of feature aggregation, and calculating the features over the desired text. Of these steps, the most difficult is choosing the appropriate features: there is potentially an infinite number of possible features, and choosing those most appropriate for classifying deception or concealment requires knowledge of the deception domain. A number of general features have been identified and may be useful in many contexts. These features are discussed in Section 3) below.

2) Classifying Messages

Classifying the messages starts with manually classifying the messages in the training set, then preparing the data for automatic classification, choosing an appropriate classification method, training and testing the model, and evaluating the results. Because unsupervised learning may or may not create clusters based on deception, message feature mining uses supervised learning and manual classification of the training and testing sets. Once the data set is manually classified, it must be cleaned and formatted for input into the machine learning algorithms.

After the data are ready for classification, an appropriate classification method or set of methods must be chosen. There are a number of methods to choose from, each with its own advantages and disadvantages [19].


Furthermore, most machine learning methods have a number of parameters (such as the number of hidden nodes in a neural network) that adjust the behavior of the models, resulting in a very large number of possible models. Choosing a set of methods to use can be daunting; however, some methods that seem to have withstood the test of time include inductive decision trees and neural networks. After the method or set of methods is chosen, it is a simple task to train and test the data and obtain accuracy results. Once obtained, the results can be used as feedback for modifying the features, the granularity, and/or the classification methods in an effort to improve the results.

3) The Desert Survival Study

The Desert Survival study was designed with two purposes in mind: first, to test message feature mining with a set of cues to deception, and second, to create a data repository for testing automated deception detection tools. To this end, the study utilized the Desert Survival Problem [20], which provides an environment for group communication, and produced a set of deceptive and truthful messages. This set of messages provided a test bed for determining deceptive cues and for testing message feature mining for deception detection.

The Desert Survival Problem places groups of two in a situation where they must rank 12 items according to how important each item is for survival in the desert. Before beginning the task, group members are given expert advice on how to survive in the desert, and one member of the group is instructed to be deceptive. Group members discuss the items and come to a consensus on how to rank them. The deceptive member is encouraged to steer the group's consensus contrary to his or her own opinion. A more detailed explanation can be found in [3] and [2].

TABLE III
EXAMPLE FEATURES (ADAPTED FROM [3])

• Word quantity
• Average sentence length [21]
• Passive voice ratio:
  (total # of passive verbs) / (total # of verbs)
• Emotiveness [21]:
  (total # of adjectives + total # of adverbs) / (total # of nouns + total # of verbs)
• Content word diversity:
  (total # of unique content words) / (total # of content words),
  where content words primarily express lexical meaning (not function words)
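The sketch below gives one rough way to compute these five features. It relies on NLTK's default tokenizer and tagger and on a crude passive-voice heuristic, so it approximates rather than reproduces the operationalizations in [3].

import nltk  # requires the punkt and averaged_perceptron_tagger data

def table3_features(text):
    words = nltk.word_tokenize(text)
    sents = nltk.sent_tokenize(text)
    tags = nltk.pos_tag(words)

    def count(prefix):
        return sum(tag.startswith(prefix) for _, tag in tags)

    verbs, nouns = count("VB"), count("NN")
    adjectives, adverbs = count("JJ"), count("RB")
    # Crude passive-voice cue: a form of "be" followed by a past participle.
    be = {"be", "is", "are", "was", "were", "been", "being"}
    passives = sum(w.lower() in be and i + 1 < len(tags) and tags[i + 1][1] == "VBN"
                   for i, (w, _) in enumerate(tags))
    # Content words approximated as nouns, verbs, adjectives, and adverbs.
    content = [w.lower() for w, tag in tags
               if tag[:2] in ("NN", "VB", "JJ", "RB")]
    return {
        "word_quantity": len(words),
        "avg_sentence_length": len(words) / max(len(sents), 1),
        "passive_voice_ratio": passives / max(verbs, 1),
        "emotiveness": (adjectives + adverbs) / max(nouns + verbs, 1),
        "content_word_diversity": len(set(content)) / max(len(content), 1),
    }

print(table3_features("The map was lost. We think the water is clearly more useful."))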

The data consist of all of the messages sent by all of the participants on each day of the study. Each message is considered a document and is classified as deceptive or truthful based on whether the participant was instructed to be deceptive. Table III gives the operational definitions for a sample of 5 of the 23 features used in the Desert Survival study. All of the features are explained in [3].

4) Experimental Findings

Zhou et al. [3, 22] used the Desert Survival Problem in a study with groups of two. One of the subjects in some of the pairs was instructed to deceive his or her partner by recommending a ranking counter to his or her actual opinion. Using the automated message feature mining technique, the researchers were able to obtain approximately 80% accuracy in detecting deceptive messages and subjects—much better than the 50% baseline accuracy of guessing. Although the technology is not perfect, it has a number of possible uses. For example, in a situation where deception is suspected, large email archives could be searched for messages that exhibit deceptive cues, thereby reducing investigators' workload. As noted earlier, coupled with automatic speech recognition, it could aid interviewers in secondary screening.

B. Speech Act Profiling

Speech act profiling [23] is a method of analyzing and visualizing conversations and participants' behavior according to how they converse rather than the subject of the conversation. Since people may deceive in any domain, it is useful to have an analysis technique that is domain independent. Speech act profiling provides a domain-independent analysis of conversations by combining the concepts of speech act theory, automated speech act classification, and fuzzy logic.

Speech act theory posits that any utterance (usually a sentence) contains a propositional content part, c, and an illocutionary force, f [24]. The propositional content is the meaning of the words that make up the utterance. For example, the statement "it's cold in here" has the propositional content that the room or area where the speaker is located is cold. The illocutionary force, however, is the intent of the speaker's assertion that something about the world is true. That is, the speaker is doing something by speaking, which in this case is asserting. Speakers can do many things with an utterance. They can assert, question, thank, declare, insult, and order, and even make substantial changes in the world, such as marrying a couple or inaugurating a president. There may be more than one illocutionary force or speech act associated with an utterance, and the actual act is determined by the context in which the words are uttered. The previous example, "it's cold in here," if uttered by an army general to a private, might be an order to turn up the thermostat rather than just a simple statement. Thus, every utterance has several illocutionary act potentials, each dependent on the context. From here on, the illocutionary force or act potentials are labeled speech acts.
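One convenient way to represent this is to carry the propositional content together with a fuzzy distribution over candidate speech acts, as in the illustrative structure below. The tag labels loosely follow the set used in [27], and the membership values are invented.

from dataclasses import dataclass

@dataclass
class Utterance:
    content: str      # the propositional content c
    potentials: dict  # fuzzy membership in each candidate speech act (force f)

# "it's cold in here" as a statement, an opinion, or (from a general to a
# private) a directive; the memberships are invented for illustration.
u = Utterance("it's cold in here",
              {"sd": 0.5,    # statement-non-opinion
               "sv": 0.2,    # statement-opinion
               "ad": 0.3})   # action-directive, e.g. "turn up the thermostat"
print(max(u.potentials, key=u.potentials.get))  # most likely act in this context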

Speech acts are important in deception detection for two reasons: first, they are the means by which deception is transmitted; and second, they provide a mechanism for studying deception in conversations in a content-independent manner. Deceptive speakers may express more uncertainty in their messages than truthtellers [25], and this uncertainty can be detected in the types of speech acts speakers use. For example, uncertain speakers should tend to use more opinions, "maybe" expressions, and questions than truthtellers.

TABLE IV
SELECTED SPEECH ACTS FOUND IN THE SPEECH ACT PROFILES (ADAPTED FROM [26])

Name                                      Tag    Example
STATEMENT-NON-OPINION                     sd     I'm in the legal department.
ACKNOWLEDGE (BACKCHANNEL)                 b      Uh-huh.
STATEMENT-OPINION                         sv     I think it's great
AGREE/ACCEPT                              aa     That's exactly it
ABANDONED, TURN-EXIT OR UNINTERPRETABLE   %      So,-
APPRECIATION                              ba     I can imagine
YES-NO-QUESTION                           qy     Do you have to have special training?
YES-ANSWER                                ny     Yes.
WH-QUESTION                               qw     Well, how old are you?
NO ANSWERS                                nn     No.
AFFIRMATIVE NON-YES ANSWER                na     It is.
MAYBE/ACCEPT-PART                         maybe  Something like that

Speech acts are important and technically useful because a method has been created to automatically identify them [27]. This method uses a manually annotated corpus of conversations to train n-gram language models and a hidden Markov model (HMM), which in turn identify the most likely sequence of speech acts in a conversation. Using the principles of fuzzy logic, the probabilities from the HMM can be taken as degrees to which an utterance belongs to a number of fuzzy sets representing the speech acts. Speech act profiling aggregates these fuzzy sets and subtracts from them a "normal" conversation profile (created from the training corpus) to create a profile for the entire conversation. An example profile is shown in Fig. 3. Additionally, Table IV shows a selection of 12 of the 42 speech acts used in [27].
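The aggregation step can be sketched in a few lines: per-utterance membership values (as the HMM tagger would produce) are averaged per speaker, and the normal profile is subtracted. All numbers below are illustrative, not trained values.

import numpy as np

TAGS = ["sd", "sv", "maybe", "qy"]  # a 4-tag subset of the 42 used in [27]

# Rows are one speaker's utterances; columns are fuzzy memberships per tag,
# as the HMM tagger would emit. All numbers are invented.
posteriors = np.array([[0.2, 0.5, 0.2, 0.1],
                       [0.1, 0.4, 0.4, 0.1],
                       [0.3, 0.3, 0.3, 0.1]])

# "Normal" conversation profile: tag proportions over the training corpus.
normal_profile = np.array([0.35, 0.25, 0.05, 0.10])

profile = posteriors.mean(axis=0) - normal_profile  # deviation from normal
for tag, dev in zip(TAGS, profile):
    print(f"{tag:6s} {dev:+.2f}")  # positive = more frequent than normal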

Fig. 3 (a) Sample speech act profile showing submissive and uncertain behavior by the deceiver, as indicated by (b) the greater number of MAYBE/ACCEPT-PARTs (maybe) and OPINIONs (sv) and (c) the fewer STATEMENTs (sd)

Fig. 3 is a speech act profile created from all of the utterances from a single multi-player online game, StrikeCom [28]. One of the players, Space1, had been told to deceptively steer the group away from bombing the correct locations (the goal of the game) and to conceal his intentions. In this particular game, the profile indicates that the participant playing Space1 is uncertain compared to the other participants, Air1 and Intel1, as indicated by the greater number of MAYBE/ACCEPT-PARTs (maybe) and OPINIONs (sv), magnified in Fig. 3(b), and the fewer STATEMENTs (sd), magnified in Fig. 3(c). An example of this uncertain language is shown in the excerpt in Table V. Early in the game, Space1 hedges the comment "i got a stike on c2" with the comment "but it says that it can be wrong..." Later, Space1 qualifies his advocacy of grid space e3 with "i have a feeling." In reality there was no target at e3, and Space1 was likely attempting to deceive the others as instructed. In DePaulo et al.'s meta-analysis of deception [25], vocal and verbal impressions of uncertainty by a listener were significantly correlated with deception. That is, when deception is present, the receiver of the deceptive message often notices uncertainty in the speaker's voice or words. Since the voice channel isn't available in chat, any uncertainty would have to be transmitted and detected using only the words. The uncertainty transmitted in the words is picked up by the profile in Fig. 3 in the form of a high proportion (relative to the other players) of MAYBE/ACCEPT-PARTs (maybe) and OPINIONs (sv) and a low proportion of STATEMENTs (sd).

TABLE V
EXCERPT OF CONVERSATION REPRESENTED BY THE SPEECH ACT PROFILE IN FIG. 3

Speaker  Utterance
Space1   i got a stike on c2.
Space1   but it says that it can be wrong...
...      ...
Space1   i have a feeling theres on at e3... also, on the next turn we need to check columns one and two.

1) Experimental Findings

To test speech act profiling's ability to aid deception detection, Twitchell et al. [29] identified those speech acts that were related to uncertainty and summed their proportions for each participant in each conversation. They found that the deceptive participants in the conversations were significantly more uncertain than their fellow players. This result shows that using speech act profiling as part of deception detection in text-based synchronous conversations is promising. Besides uncertainty, other groupings of speech acts could be tested, such as dominant or submissive behavior, which has also been associated with deception [25].
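The test itself reduces to a simple computation, sketched below with an assumed set of uncertainty-related tags and invented proportions; [29] defines the actual tag grouping.

# An assumed grouping of uncertainty-related tags; [29] defines the real set.
UNCERTAIN_TAGS = {"maybe", "sv", "qy"}

def uncertainty_score(profile):
    """profile maps each speech-act tag to its proportion for one speaker."""
    return sum(p for tag, p in profile.items() if tag in UNCERTAIN_TAGS)

# Invented per-player tag proportions from a StrikeCom-style game.
players = {
    "Space1": {"sd": 0.20, "sv": 0.30, "maybe": 0.15, "qy": 0.10},
    "Air1":   {"sd": 0.45, "sv": 0.15, "maybe": 0.02, "qy": 0.08},
    "Intel1": {"sd": 0.40, "sv": 0.18, "maybe": 0.03, "qy": 0.07},
}
for name, prof in players.items():
    print(name, round(uncertainty_score(prof), 2))  # the deceiver scores highest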

When coupled with automated speech recognition, speech act profiling may also show promise for interviewing suspected transportation security threats (further detailed in Section VI). The profile in Fig. 4 shows a conversation from the SwitchBoard corpus that indicates an interview is occurring. Speaker B is questioning Speaker A, as indicated by the greater-than-normal number of WH-QUESTIONS (qw) and YES-NO-QUESTIONS (qy) by Speaker B and the greater-than-normal number of STATEMENTS (sd) by Speaker A. If Speaker A were attempting to conceal malicious intent, the profile might reveal the concealment by showing a higher proportion of MAYBE/ACCEPT-PARTs (maybe) and OPINIONs (sv) and a lower proportion of STATEMENTs (sd) than we see in this interview, where no deception or concealment occurs.

Fig. 4 A speech act profile of an interview with no deception or concealment

C. Future Steps for Verbal Analysis

Speech act profiling could be improved by using a set of HMMs that represent concealment conversational patterns. Example conversations, each with its speech acts manually annotated, would be manually categorized, and the conversations from each category would be used to train the corresponding HMM in a process similar to the one currently used in speech act profiling [23, 27]. New conversations to be classified would then be run through each HMM. When one of the models produces an output probability beyond an empirically derived threshold, the conversation could be labeled as containing concealment.
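A self-contained sketch of this scheme follows: a hand-rolled forward algorithm scores a sequence of speech-act indices under a "concealment" HMM and, as one possible thresholding scheme, compares it against a "normal" model. Every parameter below is invented rather than trained.

import numpy as np

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm over a sequence of speech-act indices."""
    alpha = start * emit[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

# Two hidden states over three observable acts (say sd, sv, maybe).
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
emit_concealment = np.array([[0.2, 0.5, 0.3],
                             [0.1, 0.4, 0.5]])
emit_normal = np.array([[0.6, 0.3, 0.1],
                        [0.5, 0.3, 0.2]])

conversation = [1, 2, 2, 1, 0, 2]  # observed speech-act indices
score = (log_likelihood(conversation, start, trans, emit_concealment)
         - log_likelihood(conversation, start, trans, emit_normal))
print("flag as possible concealment" if score > 0.0 else "no flag", round(score, 2))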

V. CONCEALMENT DETECTION FROM NONVERBAL CUES

As with the verbal analysis, the analysis of the nonverbal behavioral characteristics of a sender requires a rich data set. The research team has compiled data from its past and current research projects to form test beds that are good examples of human behavior. The available data sets are briefly described below, along with the experiment designs, to illustrate their richness and potential for this research.

A. Behavioral Analysis Interview Data Set

Because experimentally generated data often lack the high motivation and severity of consequences found in real-world circumstances, we have obtained videotapes of criminal suspects being subjected to behavioral analysis interviews. Some interviewees were found guilty based on confession or on independent corroborating evidence, while others were exonerated. This test bed enables separating the nervousness common to anyone subjected to a criminal interview from the behavior patterns uniquely associated with criminal conduct. We believe that the criminals' behavior will be a valid surrogate for the concealing or deceptive behavior evidenced in other contexts.

B. Mock Theft Experiment Data Set

The Mock Theft experiment [4, 5] was designed to reveal cues that can be used in detecting deception. In this experiment, some participants played the role of a thief while others were simply present during the theft. All participants were subsequently interviewed by untrained and trained interviewers via text chat, audio conferencing, or face-to-face interaction. A companion observer study includes third-party assessments of senders' trustworthiness and thus can serve as a second form of independent verification of whether the interviewees' language, content, and nonverbal behavior appeared trustworthy or suspicious.

C. Airport Scenario Data Set

Four actors were hired to assist in a proof-of-concept study to determine the feasibility of identifying behavioral states from gestures and body movement. They participated in scenarios designed to simulate airport screening procedures. The scenarios included seated interaction, standing interaction, queuing, and locomotion. Within each scenario, each actor was asked to demonstrate three states: relaxed, agitated, and over-controlled.

D. Machine Learning Training Set

Seven employees of CMI created a set of videos used to train machine learning tools to identify gestures. Twenty gestures involving the fingers, hands, arms, trunk, and head were repeated 10-12 times by each participant.

E. Data Set Analysis

In an effort to test the concept of automatically identifying nonverbal cues that arouse suspicion, the CMI at the University of Arizona and the Center for Computational Biomedicine, Imaging and Modeling (CBIM) at Rutgers University conducted a proof-of-concept study. This study investigated methods for identifying agitated, relaxed, and over-controlled behaviors from nonverbal cues in a video segment. The results of this proof-of-concept study are illustrated below.

1) Location Estimation of the Head and Hands

Central to the recognition of nonverbal signals, including individual gestures, in video is the ability to recognize and track body parts such as the head and hands. This issue has been investigated in the past (see [30]), and CBIM's use of "blob analysis" provides a useful approach to examining human movement [31, 32]. Using color analysis, eigenspace-based shape segmentation, and Kalman filters, we have been able to track the position, size, and angle of different body parts with great accuracy. Fig. 5 shows a single frame of video that has been subjected to blob analysis. The ellipses in the figure represent the body parts' position, size, and angle.

Fig. 5 Blobs capture the location of the head and hands

Blob analysis extracts the hand and face regions from an image sequence using color distribution. A three-dimensional look-up table (3-D LUT) holding the color distribution of the face and hands is created in advance using skin color samples. After extracting the hand and face regions from an image sequence, the system computes elliptical "blobs" identifying candidates for the face and hands. The 3-D LUT may incorrectly identify candidate regions whose color is similar to skin, but these candidates are discarded through fine segmentation and by comparing the subspaces of the face and hand candidates. Thus, the most face-like and hand-like regions in a video sequence are identified. From the blobs, the left hand, right hand, and face can be tracked continuously. For a detailed description of this process, refer to [33].
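As a simplified stand-in for this pipeline, the OpenCV sketch below segments skin-colored pixels with fixed YCrCb bounds (in place of a trained 3-D LUT) and fits ellipses to the largest regions; it omits the fine segmentation and subspace comparison steps described above.

import cv2

def skin_blobs(frame_bgr, min_area=500):
    """Return ellipse parameters for the largest skin-colored regions."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, (0, 135, 85), (255, 180, 135))  # rough skin range
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    blobs = []
    for c in contours:
        if cv2.contourArea(c) >= min_area and len(c) >= 5:  # fitEllipse needs 5 points
            (cx, cy), (ax1, ax2), angle = cv2.fitEllipse(c)
            major, minor = max(ax1, ax2), min(ax1, ax2)
            blobs.append({"center": (cx, cy), "axes": (major, minor), "angle": angle})
    # Keep the three largest ellipses as face and hand candidates.
    blobs.sort(key=lambda b: b["axes"][0] * b["axes"][1], reverse=True)
    return blobs[:3]

# Example: blobs = skin_blobs(cv2.imread("frame.png"))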

From each blob, a number of measurements are recorded for each frame in an image sequence. As demonstrated in Fig. 6, the center of the blob is captured as x and y coordinates, based on the pixels contained in each frame. Further, the lengths of the major and minor axes of the ellipse are recorded in pixels. Finally, the angle of the major axis of the blob is recorded. Table VI contains a small example data stream from a single blob.

Fig. 6 An example of a blob that surrounds the hands and head, parameterized by its center (xc, yc), major and minor axes a and b, and major-axis angle θ [34]

TABLE VI
MEASUREMENTS FROM THE HEAD BLOB. SIMILAR MEASUREMENTS ARE COLLECTED FROM THE HAND BLOBS [34]

Frame  X    Y    Major Axis Length  Minor Axis Length  Major Axis Angle
1      322  149  90                 63                 0.48
2      322  149  90                 63                 0.48
3      323  149  88                 64                 0.46
4      323  148  89                 65                 0.49

From the positions and movements of the hands and face we can make further inferences about the trunk and about relations to other people and objects. This allows the identification of gestures, posture, and other body expressions [33, 34].

Although methods based on color analysis offer a great deal of precision in tracking the head and hands, there are drawbacks to such an approach. First, the process is more time-intensive than other gesture recognition methods, and color analysis requires complex initialization not found with other methods [35]. Further, the process can be disrupted by significant occlusion, such as a subject wearing gloves.

Despite these drawbacks, color analysis provides much greater accuracy in estimating the location of body parts than other methods. In a controlled setting, such as indoors at an airport checkpoint, color analysis offers an effective foundation on which behavioral analysis is possible.

2) Determining Behavioral State from Gestures

Some behaviors associated with deception can be classified into two groups: agitation and over-control. Related to agitation are manifestations of nervousness and fear [36]. One example of nervous behavior is fidgeting [37]. Although the link between fidgeting and deception is still debated, DePaulo's meta-analysis reviewing numerous studies on deception found a significant relationship between undirected fidgeting and deception, although it questions the role of self-touches and object touches in predicting deception [25].

Liars may be aware of behavioral cues, such as fidgeting, that might reveal their deception. In an effort to suppress deceptive cues and appear truthful, liars may overcompensate and dramatically reduce all behavior [8, 25]. Such tenseness and over-control can be seen in decreased head movements [38], leg movements [39], and hand and arm movements [12].

Using our model presented in Fig. 2, a baseline of behavior was established for the agitated, relaxed, and over-controlled behavioral states using the airport scenario data set. The airport scenarios were then subjected to blob analysis, and the resultant data from each video frame, along with the velocity of the hands' movements, the frequency of the hands touching the face, and the frequency of the hands coming together, were used to calculate the behavioral state associated with the movement.

Fig. 7 graphically displays sample data taken from one of the actors, as well as sample frames from the video segments. The figure shows the change in position and velocity of the head and hands for each frame of video. In the agitated state, changes in head and hand positions are rapid and frequent. In the controlled state, changes in head and hand positions are slow and infrequent, and the relaxed state shows moderate changes in position and velocity.


Fig. 7 Noticeable differences exist in the changes of position and velocities of the hands and head between the states.

The data captured by the blob analysis of the video clips from the airport scenarios were then used to calculate the behavioral state associated with the movement. The state is calculated as in Equation (1):

State = [W1*F1 + W2*(F2 + F3)]*F0    (1)

where F1 is the variance of the head velocity Vhead, i.e., F1 = var(Vhead), and Fi+1 = var(Vhand(i)) / var(Phand(i)), i = 1, 2, with Vhand(i) and Phand(i) indicating the i-th hand's velocity and position, respectively.

Also, W1 and W2 are the weights with which the head and hand parameters participate in the decision. They are defined as

W1 = 1 / var²(Phead)    (2)

where Phead is the position of the head, and

W2 = 1 / (fhand-face + fhand-hand)²    (3)

where fhand-face is the frequency of a hand touching the face and fhand-hand is the frequency of the hands touching each other.

The weights defined in Equations (2) and (3) have the following meaning. From our observations, we could not tell whether a subject was agitated or relaxed from the head movements, which are usually rapid in both cases. Thus, when the head moves abruptly and often, we do not take it into consideration in our results. Also, the more often two blobs merge into one, i.e., the more often the hands touch each other or a hand touches the face, the less information we have about the hand movements outside these events (time segments), and thus the respective positions and velocities are less useful.

Finally, the parameter F0 is used as a normalization factor:

F0 = fhand-face / Dhand-face    (4)

where Dhand-face is the duration (number of frames) of the event "hand on face." After normalization to the range between 0.0 and 1.0, we can obtain a rough estimation of the state, as shown in Table VII [33].

TABLE VII
BEHAVIORAL STATES OF ACTORS

State       State Values
Controlled  0.0 < State < 0.2
Relaxed     0.2 < State < 0.7
Agitated    0.7 < State < 1.0
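Equations (1)-(4) and the Table VII thresholds translate directly into code, as sketched below. Since the paper does not detail the normalization procedure, clamping to [0, 1] stands in for it, and the input tracks are synthetic.

import numpy as np

def behavioral_state(p_head, v_head, p_hands, v_hands,
                     f_hand_face, f_hand_hand, d_hand_face):
    F1 = np.var(v_head)                               # variance of head velocity
    F23 = sum(np.var(v) / np.var(p)                   # F2 + F3 over both hands
              for p, v in zip(p_hands, v_hands))
    W1 = 1.0 / np.var(p_head) ** 2                    # Equation (2)
    W2 = 1.0 / (f_hand_face + f_hand_hand) ** 2       # Equation (3)
    F0 = f_hand_face / d_hand_face                    # Equation (4)
    return (W1 * F1 + W2 * F23) * F0                  # Equation (1)

def label(state):  # Table VII thresholds, after normalization to [0, 1]
    return "controlled" if state < 0.2 else "relaxed" if state < 0.7 else "agitated"

# Synthetic per-frame tracks; velocities as frame-to-frame differences.
rng = np.random.default_rng(0)
p_head = rng.normal(0.0, 1.0, 100)
v_head = np.diff(p_head, prepend=p_head[0])
p_hands = [rng.normal(0.0, 2.0, 100) for _ in range(2)]
v_hands = [np.diff(p, prepend=p[0]) for p in p_hands]

raw = behavioral_state(p_head, v_head, p_hands, v_hands,
                       f_hand_face=3, f_hand_hand=2, d_hand_face=40)
print(label(min(max(raw, 0.0), 1.0)))  # clamping stands in for normalization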

These thresholds were then tested on subjects from the Mock Theft experiment in which interviewees displayed relaxed and over-controlled behavior. The state was accurately determined for the two mock theft and three airport scenario interviewees that were tested using the state equation.

Clearly, automatically judging behavioral states is very difficult. While this proof-of-concept study is simplistic in calculating behavioral states from observed movement, it does show that such an approach may be possible. To obtain more acceptable results, a more flexible model must be created for behavioral state determination, and more features and cues should be included in the model.

3) Future Steps in Nonverbal Concealment Detection

While blob analysis may be a useful approach for detecting concealment, large hurdles remain before such a system could actually be deployed. To analyze the behavior of people and detect concealment in an actual transportation setting, a near real-time, automated system is necessary. In building a near real-time system, there are serious challenges to overcome, such as video-rate processing and automatic detection of and recovery from failures. Currently, the processing speed of blob analysis reaches about 15 frames per second at 320×240 resolution. Sampling every other frame has been proposed as a solution; moreover, given the continuing improvement of computer technology, faster processing rates can be expected with time.

Another issue confounding the creation of a near real-time system is the considerable effort required to create the training skin samples. This task becomes even more onerous when dealing with the large numbers of people present in a public area such as an airport. This problem is currently being explored, and a combination of natural images and computer-synthesized samples may be a possible solution.

In addition to the technical issues in concealment detection, obstacles exist in extracting behavioral cues that might indicate concealment. Although the initially identified cues described in Equation (1) show promise, efforts are currently underway to investigate additional kinesic cues that may be useful in automatically identifying concealment. These efforts are based on work from the psychology and communication disciplines and match reliable deceptive cues with features that can be identified in blob analysis. Promising cues that can be automatically identified are the location of gestures in relation to the body and the total amount of gestural activity.

Equation (1) could be replaced with a two-layer HMM learning system. The first HMM layer learns to recognize the gestures of interest, and the output from that layer is input to the second HMM layer, which classifies the behavioral state. The use of HMMs overcomes the problems of parameter tuning, thresholding, and video segmentation that are present when using Equation (1).

VI. APPLICATION TO TRANSPORTATION SECURITY

Both the verbal and nonverbal methods for concealment and deception detection have the potential to be useful in transportation security. In airport and border screening, at least one instance of face-to-face verbal interaction occurs. With adequate speech recognition—which is continuously improving—message feature mining and speech act profiling, along with other methods such as voice stress analysis, could be combined to aid primary screeners in detecting persons attempting to conceal hostile intent. However, since relatively little is often said in primary screening, the verbal methods should have greater success in secondary screening, where an interview-style conversation occurs. The nonverbal methods described above could also aid during the primary and secondary screening activities but could additionally be of great service during pre- and post-screening surveillance.

A. The Airport Scenario

A possible application for these methods is the airport scenario. Aviation security is perhaps the most familiar transportation security scenario. Most security systems are implemented in levels: no single level of security is expected to stop all attempts at a security breach; instead, each level lowers the probability that a breach will occur. Aviation security operates on this principle. The layers begin with law enforcement agencies searching out and apprehending potential threats to aviation security. They end with such measures as reinforced cockpit doors, flight crew training, and professional air marshals. Between these layers lies security in the airport itself.

Fig. 8 depicts some of the major layers involved in passenger screening at airports. The first layer, ticket purchase, might occur inside or outside the airport, and the transaction sometimes involves a verbal exchange. This verbal exchange could potentially be analyzed; however, any analysis would likely have to be done post hoc, since training ticket agents to use a deception detection system would be extremely challenging. In the U.S., after the ticket purchase, passenger information obtained through the purchase is sent to a system called the Computer Assisted Passenger Prescreening System (CAPPS) [40]. This system checks the passenger information against watch lists of known threats to U.S. security and assigns travelers security risk ratings based on confidential criteria. Some of the information sent to CAPPS is also collected at check-in time, the next layer.

Check-in provides another layer where passengers are screened using CAPPS; verbal analysis could be used here but is unlikely to be, for the same reasons as at the ticket purchase layer. Nonverbal analysis, however, could begin at this point. Cameras placed around the airport could track the movements of individuals at most layers of the security process. Surveillance while travelers are waiting immediately before screening could provide the best opportunity for identifying concealment using blob analysis. At this point, travelers are slowed and often stopped in their progress toward their gate, reducing the need to track forward movement and allowing a controlled environment. Screening areas could be designed so that only one person is in the field of view of the camera at a time. Additionally, the anxiety of being screened may elicit more of the behaviors of interest from a suspect than other areas of the airport would. Although pre-screening may be the most productive area, surveillance and nonverbal analysis could continue through the remainder of the process until boarding.

Fig. 8 Airport security layers. Bolded boxes indicate layers amenable to nonverbal analysis. Double-bordered boxes indicate layers that could be subject to both verbal and nonverbal analysis.

Unlike the verbal analysis, nonverbal analysis does not require the airline or security agents to interact with the concealment detection system. Instead, trained experts in a control room could use the system to raise alerts when suspicious activity occurs; any alert would require further manual analysis to ascertain the threat.

Recognizing deception becomes even more critical when a suspicious individual has been identified during primary screening, which may happen at a ticketing counter or metal detector or through preliminary verbal or nonverbal analysis. Often such an individual is asked to undergo some form of secondary screening. The secondary screening usually takes the form of an interview-style conversation in which the suspicious individual is asked several pointed questions. During the interview, the agent must decide the validity of the individual's responses and whether the individual should be allowed to proceed. Because even trained interviewers have difficulty detecting deception, it would be useful for the interviewer to be augmented with unbiased feedback concerning the deceptive potential of the interviewee.

Although the verbal methods are being developed using transcribed interviews, they could utilize speech recognition software for real-time interview analysis, as shown in Fig. 9. At primary screening, potential interviewees are screened using methods such as metal detectors, brief questioning, or nonverbal concealment detection. During secondary screening, if the interviewee does not admit guilt to any offense, the interviewer must determine whether the interviewee is being deceptive. Law enforcement and others use a number of interviewing methods to attempt to detect deception. These interviewing methods would be augmented by a computerized deception detection system. The system uses automatic speech recognition to convert the speech into usable text. The text is then run through a number of deception detection algorithms whose results can aid the interviewer in determining deception. For example, if the interviewee attempts to equivocate in response to the questions posed, a deception detection algorithm could detect the uncertainty expressed and alert the interviewer. The interviewer could then pursue a more extensive line of questioning than he or she would have done otherwise. The nonverbal blob analysis could be used in the same fashion, analyzing behavior and providing feedback to the trained interviewer.


Fig. 9 Process of augmenting verbal deception detection in the secondary screening portion of Fig. 8. Suspect travelers are referred to secondary screening, where they are interviewed. The audio stream from the interview is captured, converted to text, and fed into the concealment detection system. The system then gives feedback to the interviewer.
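A minimal sketch of this feedback loop appears below, assuming an external speech recognizer. The recognize and alert_interviewer functions are hypothetical placeholders for the ASR component and the interviewer's display, and the hedge-word list and threshold are illustrative, not the cue set used in this research.

    # Sketch of the secondary-screening augmentation loop (illustrative).
    HEDGES = {"maybe", "perhaps", "possibly", "probably", "somewhat"}

    def recognize(audio_chunk) -> str:
        """Hypothetical placeholder: a real system would call ASR here."""
        raise NotImplementedError

    def alert_interviewer(message: str) -> None:
        """Hypothetical placeholder: push feedback to the interviewer."""
        print(f"[ALERT] {message}")

    def analyze_response(text: str) -> None:
        """Flag possible equivocation when hedging words dominate."""
        words = text.lower().split()
        hedge_rate = sum(w in HEDGES for w in words) / max(len(words), 1)
        if hedge_rate > 0.10:  # illustrative threshold
            alert_interviewer(f"high equivocation ({hedge_rate:.0%} hedges); "
                              "consider a follow-up line of questioning")

    # Example with an already-transcribed response:
    analyze_response("I was maybe going to stay, perhaps a week or so")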

The security scenario at border crossings is similar to the airport scenario. Instead of ticketing, check-in, screening, and boarding, border crossing typically has only primary and secondary screening. The verbal and nonverbal analyses could be used during these screenings, and nonverbal analysis could be used during pre- and post-screening. Indeed, as mentioned in Section VIII, an experiment is currently being mounted to test blob analysis at a border crossing.

VII. EVALUATION

The FAA has established a set of criteria for evaluating technologies for aviation security [41]. These nine criteria were developed over a number of years with assistance from several private consulting firms. A useful exercise is to evaluate the methods presented in this paper against the FAA criteria. The nine criteria are (1) technical credibility, (2) comprehensiveness, (3) usefulness, (4) usability, (5) ease of implementation, (6) flexibility, (7) applicability, (8) subjective judgments, and (9) total life-cycle cost. Definitions of the criteria, cited from [41], open each subsection below. Most of the criteria cannot be fully applied until an actual system exists; the deception and concealment detection methods described in this paper are not yet developed enough to create a deployable system. Despite this limitation, the methods are rated as far as possible against each of the criteria.

A. Technical Credibility

The degree to which the methodology is technically sound and provides reproducible results. This criterion reflects the inherent scientific and technical quality of the methodology and the level of technical credibility in its application to the assessment problem. This includes proper treatment of uncertainty.

Although some laboratory experiments have been performed, neither of the methods described in this paper has been adequately tested in the field to ensure that it is technically sound and creates reproducible results; however, as described, they show great promise. On the other hand, the methods are grounded in respected and thoroughly tested theories.

B. Comprehensiveness

The degree to which the methodology addresses important dimensions of the airport security planning process: evaluating countermeasure tradeoffs, risk assessment, and cost/benefit analysis.

Issues such as countermeasure tradeoffs, risk assessment, and cost/benefit analysis have not yet been addressed with the concealment detection methods; they remain for future work.

C. Usefulness

The degree to which the outputs of the process/methodology are (a) understandable, (b) meaningful, and (c) useful for making decisions.

Both the verbal and nonverbal methods provide outputs that are understandable, meaningful, and useful for making decisions. The intent of a system implementing message feature mining, speech act profiling, and blob analysis is to provide the user with probabilities that a person is deceptive or is concealing malicious intent. This information could be used to further question a suspected security threat.

D. Usability

The concept of usability encompasses three dimensions: (a) user skill level needed to successfully apply the methodology, (b) adequacy of the user interface, and (c) ease of data entry for and application of the model to generate results.

For those using concealment and deception detection in primary screening, the training required to use the methods would be minimal. The system should automatically recommend to primary screeners when to refer a traveler to secondary screening. Secondary screeners would be required to have a higher skill level since, as indicated in Fig. 9, the interviewer would have to skillfully use the information from the system to attain the highest accuracy.


E. Ease of Implementation

The degree of intrusiveness and/or difficulty involved in gathering required input information. There are two dimensions: (a) the availability of the required information and (b) if available, the time involved in gathering and preparing the required information.

The information needed to make these systems work is not difficult to gather; cameras, microphones, CPUs, and user interfaces are the equipment required. Outside of secondary screening, the behaviors observed are public behaviors, so no privacy intrusion is required.

F. Flexibility

This criterion addresses issues related to the safety of the traveling public. It is measured by the degree to which the methodology could accommodate (a) the range of airport sizes, configurations, and complexities and (b) the range of threat scenarios of interest.

Conceivably, these technologies could be implemented in almost any airport configuration. Higher accuracies may be obtained in the visual analysis if changes are made to the screening and prescreening areas, but these changes would not be drastic or expensive. The verbal and nonverbal methods are mostly suitable against security threats who follow the normal passenger movement routine in an airport; they are not useful against those who would infiltrate secure areas or attempt to get inside through employment.

G. Applicability

The degree to which the methodology is applicable to other operational, but not public-safety related, security interests: theft (e.g., baggage theft, pilferage, or concession theft) and criminal activities.

These methods are not particularly suitable for non-public-safety interests such as theft. They could be used for interviews of suspected thieves, but the benefits of using this kind of system may not be worth the implementation effort.

H. Subjective Judgment

The degree to which subjective judgments are used in the vulnerability assessment (VA) process.

Subjective judgments are part of the concealment detection methods. The system would likely be designed so that most subjective judgments are limited to secondary screening, but maximizing accuracy requires that human judgment be part of the process.

I. Total Life-Cycle Cost

The total life-cycle cost for three years. There are three components: hardware, software, and technical support.

Most costs would derive from the training required to allow security personnel to operate the system.

VIII. CONCLUSION

The concealment detection methods described in this paper should be only part of a comprehensive system for preventing hostile threats to transportation systems specifically and to the homeland generally. Each part of the system represents a layer of security that reduces the probability of threatening actions. As part of an integrated system, concealment detection as described here should be tested both independently and together with other security measures.

A. Future Steps

It is evident that the concealment detection methods are not developed enough to permit an adequate scientific or operational evaluation. Much needs to be done to improve and validate the concealment detection model and methods.

The development of a fusion engine to combine the nonverbal cues with previously identified text-based and audio indicators is one way to strengthen reliability in concealment detection and is depicted in the model in Fig. 2. The data streams from each type of indicator of concealment, both verbal and nonverbal, will be fed into a fusion engine, which will merge the probabilities of malicious concealment based on the weight of reliability for each method.
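A minimal sketch of this fusion idea, under the simplifying assumption that each channel's output can be merged as a reliability-weighted average, appears below. The channel names, probabilities, and weights are hypothetical; the actual engine's weighting scheme has yet to be determined.

    # Sketch of reliability-weighted fusion (illustrative assumption).
    def fuse(channel_estimates: dict[str, tuple[float, float]]) -> float:
        """Merge per-channel estimates of malicious concealment.

        channel_estimates maps a channel name to a pair
        (probability of malicious concealment, reliability weight).
        """
        total_weight = sum(w for _, w in channel_estimates.values())
        return sum(p * w for p, w in channel_estimates.values()) / total_weight

    # Hypothetical outputs from the three methods described in this paper:
    estimate = fuse({
        "blob_analysis":      (0.70, 0.6),  # nonverbal channel
        "message_features":   (0.55, 0.8),  # verbal: message feature mining
        "speech_act_profile": (0.40, 0.5),  # verbal: speech act profiling
    })
    print(f"fused probability of malicious concealment: {estimate:.2f}")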

One issue in field testing any concealment detection system in airports is the relative rarity of perpetrators engaged in concealment with malicious intent. Border crossings in the U.S., on the other hand, are replete with offenders attempting to smuggle narcotics or themselves across the border, and numerous arrests are made each day at U.S. borders. This high level of criminal activity represents a potentially valuable data collection opportunity for studying concealment. To this end, it would be useful to conduct studies and establish baselines for specific behaviors at border crossings. The data gathered would be extremely rich in behavioral cues and would provide another ecologically valid test bed in which subjects should have high motivation due to serious possible consequences. The lessons learned in the border context could then be transferred to the airport context, where criminal activities are less abundant but potentially more dangerous.


With data from contextually valid sources, such as interviews with people who have high motivation to deceive, one could investigate the cues that security and law enforcement officers use to determine the probability of concealed hostile intent. Rich data sets may also be available for training and testing machine learning tools in applicable settings where contextual constraints such as lighting, space, and equipment issues are present.

Although the idea of attaining high information assurance by automatically detecting deception and concealment seems appealing, a much more realistic goal is the development of a tool to assist humans in their judgment of these behaviors. The creation of such a tool is possible through adherence to a theoretically based model and use of realistic data sets. Although the proof-of-concept study presented here is a small first step, our approach shows promise in understanding the detection of concealment.

History has shown that transportation systems are particularly vulnerable to security threats, and although much has been done to mitigate threats to the transportation system, still more can be done. Concealment of physical objects has been and continues to be a major priority, but concealment of intent is an area that may also prove fruitful for increasing security. Concealment detection focusing on behavioral characteristics automatically tracked using blob analysis, message feature mining, and speech act profiling could be an effective means of adding to transportation security.

REFERENCES

[1] D. Buller and J. Burgoon, "Interpersonal deception theory," Communication Theory, vol. 6, pp. 203-242, 1996.
[2] L. Zhou, J. K. Burgoon, J. F. Nunamaker Jr., and D. P. Twitchell, "Automated linguistics based cues for detecting deception in text-based asynchronous computer-mediated communication: An empirical investigation," Group Decision and Negotiation, vol. 13, pp. 81-106, 2004.
[3] L. Zhou, D. P. Twitchell, T. Qin, J. K. Burgoon, and J. F. Nunamaker Jr., "An Exploratory Study into Deception Detection in Text-Based Computer-Mediated Communication," presented at the Thirty-Sixth Annual Hawaii International Conference on System Sciences (CD/ROM), Big Island, Hawaii, 2003.
[4] J. K. Burgoon, J. P. Blair, and E. Moyer, "Effects of Communication Modality on Arousal, Cognitive Complexity, Behavioral Control and Deception Detection During Deceptive Episodes," presented at the Annual Meeting of the National Communication Association, Miami Beach, Florida, 2003.
[5] J. K. Burgoon, J. P. Blair, T. Qin, and J. F. Nunamaker, "Detecting Deception Through Linguistic Analysis," presented at the NSF/NIJ Symposium on Intelligence and Security Informatics, 2003.
[6] J. George, D. P. Biros, J. K. Burgoon, and J. Nunamaker, "Training Professionals to Detect Deception," presented at the NSF/NIJ Symposium on Intelligence and Security Informatics, Tucson, AZ, 2003.
[7] M. Adkins, D. P. Twitchell, J. K. Burgoon, and J. F. Nunamaker Jr., "Advances in Automated Deception Detection in Text-Based Computer-Mediated Communication," presented at the SPIE Defense and Security Symposium, Orlando, Florida, 2004.
[8] A. Vrij, Detecting Lies and Deceit: The Psychology of Lying and Implications for Professional Practice. Chichester: John Wiley & Sons, 2000.
[9] Committee to Review the Scientific Evidence on the Polygraph, Board on Behavioral, Cognitive, and Sensory Sciences and Committee on National Statistics, Division of Behavioral and Social Sciences and Education, National Research Council of the National Academies, The Polygraph and Lie Detection. Washington, D.C.: National Academies Press, 2003.
[10] M. Steller and G. Köhnken, "Criteria-Based Content Analysis," in Psychological Methods in Criminal Investigation and Evidence, D. C. Raskin, Ed. New York: Springer-Verlag, 1989, pp. 26-181.
[11] U. Undeutsch, "The development of statement reality analysis," in Credibility Assessment, J. C. Yuille, Ed. Dordrecht, The Netherlands: Kluwer, 1989, pp. 101-121.
[12] A. Vrij, K. Edward, K. P. Roberts, and R. Bull, "Detecting deceit via analysis of verbal and nonverbal behavior," Journal of Nonverbal Behavior, vol. 24, pp. 239-263, 2000.
[13] R. G. Tippett, "A Comparison Between Decision Accuracy Rates Obtained Using the Polygraph Instrument and the Computer Voice Stress Analyzer in the Absence of Jeopardy," Florida Department of Law Enforcement, 1994.
[14] G. R. Miller and J. B. Stiff, Deceptive Communication. Thousand Oaks, CA: Sage Publications, 1993.
[15] J. K. Burgoon, D. B. Buller, L. K. Guerrero, and W. A. Afifi, "Interpersonal deception: XII. Information management dimensions underlying deceptive and truthful messages," Communication Monographs, vol. 63, pp. 50-69, 1996.
[16] J. K. Burgoon, "A communication model of personal space violations: Explication and an initial test," Human Communication Research, vol. 4, pp. 129-142, 1978.
[17] D. M. Green and J. A. Swets, Signal Detection Theory and Psychophysics. New York: Wiley, 1966.
[18] H. Stanislaw and N. Todorov, "Calculation of signal detection theory measures," Behavior Research Methods, Instruments, & Computers, vol. 31, pp. 137-149, 1999.
[19] T. M. Mitchell, Machine Learning. New York: McGraw-Hill, 1997.
[20] J. C. Lafferty and P. M. Eady, The Desert Survival Problem. Plymouth, Michigan: Experimental Learning Methods, 1974.
[21] J. K. Burgoon, M. Burgoon, and M. Wilkinson, "Writing Style as Predictor of Newspaper Readership, Satisfaction and Image," Journalism Quarterly, vol. 58, pp. 225-231, 1981.
[22] L. Zhou, D. P. Twitchell, T. Qin, J. K. Burgoon, and J. F. Nunamaker Jr., "Toward the Automatic Prediction of Deception: An empirical comparison of classification methods," Journal of Management Information Systems, vol. 20, pp. 139-166, 2004.
[23] D. P. Twitchell and J. F. Nunamaker Jr., "Speech Act Profiling: A probabilistic method for analyzing persistent conversations and their participants," presented at the Thirty-Seventh Annual Hawaii International Conference on System Sciences (CD/ROM), Big Island, Hawaii, 2004.
[24] J. R. Searle, "A Taxonomy of Illocutionary Acts," in Expression and Meaning: Studies in the Theory of Speech Acts. Cambridge, UK: Cambridge University Press, 1979, pp. 1-29.
[25] B. M. DePaulo, B. E. Malone, J. J. Lindsay, L. Muhlenbruck, K. Charlton, and H. Cooper, "Cues to Deception," Psychological Bulletin, vol. 129, pp. 75-118, 2003.
[26] D. Jurafsky, E. Shriberg, and D. Biasca, "Switchboard SWBD-DAMSL Shallow-Discourse-Function Annotation Coders Manual, Draft 13," http://www.colorado.edu/ling/jurafsky/ws97/manual.august1.html, 1997.
[27] A. Stolcke, K. Ries, N. Coccaro, E. Shriberg, R. Bates, D. Jurafsky, P. Taylor, C. Van Ess-Dykema, R. Martin, and M. Meteer, "Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech," Computational Linguistics, vol. 26, pp. 339-373, 2000.
[28] D. P. Twitchell, K. Wiers, M. Adkins, J. K. Burgoon, and J. F. Nunamaker Jr., "StrikeCOM: A Multi-Player Online Strategy Game for Researching and Teaching Group Dynamics," presented at the Hawaii International Conference on System Sciences (CD/ROM), Big Island, Hawaii, 2005.
[29] D. P. Twitchell, J. F. Nunamaker Jr., and J. K. Burgoon, "Using Speech Act Profiling for Deception Detection," in Intelligence and Security Informatics: Proceedings of the Second NSF/NIJ Symposium on Intelligence and Security Informatics (Lecture Notes in Computer Science), Tucson, Arizona, 2004.
[30] D. M. Gavrila, "The Visual Analysis of Human Movement: A Survey," Computer Vision and Image Understanding, vol. 73, pp. 82-98, 1999.
[31] K. Imagawa, S. Lu, and S. Igi, "Color-Based Hands Tracking System for Sign Language Recognition," presented at the 3rd International Conference on Automatic Face and Gesture Recognition, 1998.
[32] S. Lu, D. Metaxas, D. Samaras, and J. Oliensis, "Using Multiple Cues for Hand Tracking and Model Refinement," presented at IEEE CVPR 2003, Madison, Wisconsin, 2003.
[33] S. Lu, G. Tsechpenakis, D. N. Metaxas, M. L. Jensen, and J. Kruse, "Blob Analysis of the Head and Hands: A Method for Deception Detection," presented at the Thirty-Eighth Annual Hawaii International Conference on System Sciences, Big Island, Hawaii, 2005.
[34] J. K. Burgoon, M. Adkins, J. Kruse, M. L. Jensen, A. Deokar, D. P. Twitchell, S. Lu, D. N. Metaxas, J. F. Nunamaker Jr., and R. E. Younger, "An Approach for Intent Identification by Building on Deception Detection," presented at the Thirty-Eighth Annual Hawaii International Conference on System Sciences, Big Island, Hawaii, 2005.
[35] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, "Pfinder: Real-Time Tracking of the Human Body," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 780-785, 1997.
[36] P. Ekman, Telling Lies. New York: W. W. Norton & Company, 1985.
[37] M. Zuckerman, B. M. DePaulo, and R. Rosenthal, "Verbal and nonverbal communication of deception," in Advances in Experimental Social Psychology, L. Berkowitz, Ed. New York: Academic, 1981, pp. 1-59.
[38] D. Buller, J. Burgoon, C. White, and A. Ebesu, "Interpersonal Deception: VII. Behavioral Profiles of Falsification, Equivocation and Concealment," Journal of Language and Social Psychology, vol. 13, pp. 366-395, 1994.
[39] P. Ekman, "Lying and Nonverbal Behavior: Theoretical Issues and New Findings," Journal of Nonverbal Behavior, vol. 12, pp. 163-176, 1988.
[40] "Aviation Security: Computer-Assisted Passenger Prescreening System Faces Significant Implementation Challenges," United States Government Accountability Office, GAO-04-385, 2004.
[41] R. Lazarick, "Airport Vulnerability Assessment: A Methodology Evaluation," presented at the IEEE 33rd Annual International Conference on Security Technology, 1999.
