
© Blackwell Publishing Ltd. 2004


© Blackwell Publishing Ltd. 2004, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA

International Journal of Applied Linguistics, Vol. 14, No. 1, 2004

The “private function” of gesture in second language speaking activity: a study of motion verbs and gesturing in English and Spanish

Eduardo Negueruela, University of Massachusetts, Amherst
James P. Lantolf, Stefanie Rehn Jordan and Jaime Gelabert, Pennsylvania State University

This article examines Slobin’s concept of thinking for speaking (TFS) in the gesture/speech interface of advanced L2 speakers of English and Spanish. The focus is on the use of motion verbs in the respective languages. English, a satellite-framed language, encodes manner of motion in the verb and indicates path of motion on satellite phrases (e.g. The frog leaped out of the boy’s pocket). Spanish, a verb-framed language, encodes path, and only rarely manner of motion, in the verb. If manner is encoded at all, it is done either through lexical phrases (e.g. Tarzan saltó de liana a liana ‘Tarzan jumped from vine to vine’) or gesture. Using McNeill’s notion of growth point, the study suggests that L2 speakers, even at advanced levels, have difficulties manifesting L2 TFS patterns and continue to rely on the patterns internalized in their L1. Shifting from an L1 to an L2 TFS is particularly vexing for the L1 English speakers, because their L1 is richly endowed with manner verbs, while Spanish, their L2, is not. Spanish L1 speakers in L2 English, on the other hand, can rely on the English equivalents of basic manner verbs in Spanish. The analysis also suggests the need for reconsidering how manner verbs are categorized.

Introduction1

The analysis of the interface between speech and gesture has gained increasing prominence for researchers working in communication studies, psychology, psycholinguistics and second language learning. The leading researcher on the gesture–speech interface, David McNeill (1992, 2000a,b, 2002), describes four approaches to the integrated study of speaking and gesturing: gestures in interpersonal communication; gestures for cognitive regulation; a computational approach to understanding gesture–speech performance; and the transition from gesticulation to sign. Recently, McNeill and his associates have opened up an especially fruitful area of research in which gestures and speech are studied within Slobin’s (1996a,b, 2003) ‘thinking for speaking’ (henceforth TFS) framework, a localistic version of the linguistic relativity hypothesis (details below). In this article, we report on a study aimed at investigating the TFS patterns among proficient L2 speakers. Specifically, we examine the gesture–speech interface with respect to the manner and path constructions deployed by proficient L2 speakers of Spanish and English as they attempt to relate a narrative in their L2, and we compare their performance to that of L1 speakers of the respective languages. Although our study was inspired by the earlier research of Stam (2001) on L2 TFS, for reasons we will explain later, it is not intended as a replication of her work. Unlike Stam, we find little evidence that proficient L2 speakers use L2 TFS patterns; instead, they rely on their native language patterns.

Gesture

In popular usage, gesture refers to the conventional manual movements that optionally co-occur with speech, such as waving to indicate leave-taking, or rotating the index finger near one’s temple to indicate that someone is crazy. These behaviors, however, represent only one type of gesture, which McNeill calls emblems.2 Another popular understanding of gesture includes pantomime, in which the entire body may be involved to express meaning. Unlike emblems, however, which may or may not co-occur with speech, use of speech during pantomime is generally proscribed (McNeill 2000a: 3). The focus of our research is on gesticulation, which, according to McNeill (1992), is a specific type of gesture that occurs simultaneously with speech and never occurs in its absence. The movements manifested in gesticulation are not conventionalized, and the speaker is usually unaware of their production. More importantly, the meanings imparted through this communicative system are expressed globally and synthetically rather than analytically, as in the case of verbal language, and they serve to amplify rather than reproduce the meanings expressed in speech (McNeill 2000a).3 Two other types of gestures that have been studied, but which we will not deal with in the present article, are deictic gestures, used to single out individuals, objects, locations, or even “unseen, abstract, or imaginary things”, and beats or motor gestures, which are connected to the prosodic properties of speech rather than to its semantic content (Krauss, Chen and Gottesman 2000: 263). Given that our focus is on only one type of gesture (i.e. gesticulation), for convenience we will use the term ‘gesture’ instead of ‘gesticulation’ to refer to this type of non-verbal activity.

Vygotsky (1978) provides some important insights into the psychological relevance of gesture. In his research, he observed the ways in which children’s gesturing, writing, and drawing are linked in their development. He states that in play it is “the child’s self-motion, his own gestures that assigns the function of sign to the object and gives it meaning” (Vygotsky 1978: 108).



Pushing this insightful observation a bit further, we propose that in language activity, it is not the artifact (language) per se that assigns meaning to utterances but rather the motions of the speaker. Though this claim needs further elaboration, it rests on two important premises: first, that speaking activity needs to consider language as embodied (McNeill and Duncan 2000) and, secondly, that gestures not only complement the meaning of language but also reveal the significance of an utterance for the speaker. In other words, gestures emerge in genuine creative communicative activities, such as story telling, which, according to Vygotsky (1978), are ontogenetically tied to play.

Gesture and cognition

According to McNeill (2000), speaking and gesturing form a unit that must be analyzed as a whole or as what he calls a growth point. Even though speech and gesture are physically different phenomena, they nevertheless form a single functional system (Luria 1973) for meaning-making activity. As McNeill argues,

utterances do not emerge fully clothed in their linguistic garb; rather they develop from a primitive stage in deep time. This primitive stage dialectically integrates into a single unit, properties of thinking that have contrary characteristics: imagistic and linguistic; idiosyncratic and social, global and segmented, holistic and analytic. (1992: 220, italics in original)

Reflecting Vygotsky’s reasoning on the importance of a unit of analysis for the study of mental activity, McNeill proposes that despite their dialectic instability, growth points “must retain properties of the whole” (McNeill 1992: 220). The growth point of an utterance combines into a single meaning system “two distinct semiotic architectures”, and because each component of the unit possesses “unique semiotic properties”, each can surpass “the meaning possibilities of the other” (McNeill and Duncan 2000: 144). For this reason gesture provides “an enhanced window into mental processes” (ibid.). Thinking, according to the Growth Point Hypothesis, “is both global and segmented, idiosyncratic and linguistically patterned” (McNeill and Duncan 2000: 148). The growth point, which meshes nicely with how Vygotsky (1987) understood inner speech (see below), can be observed directly at that point in utterance production where the gesture stroke and relevant linguistic segment co-occur (McNeill 1992: 221).

As McCafferty points out in his article on gesture research in this issue, the function of gesturing has not been without its controversies. Given McCafferty’s thorough review of this work, we see no need to review it here. We do want to make the point, however, that there is no reason why gesture cannot serve both a communicative and a cognitive function (that is, facilitating lexical access). Indeed, given our theoretical stance, grounded as it is in the writings of Vygotsky and his colleagues, the cognitive function of gesture, as is the case with speaking, is derived from and predicated on its communicative function.

Vygotsky’s school of psychology rejects the dualistic perspective on language and thought reflected in the container metaphor of mind (Reddy 1983) and telementational metaphor of language (Harris 2002) and adopts a monistic stance that recognizes the necessary and inseparable dialectical unity between psychological processing and communicative activity (Leontiev 1981). This does not mean that study of perception and production processes, the traditional grist of the psycholinguist’s mill, is not relevant, but in itself it says nothing about the central role communication plays in thinking activity. The communicative principle (Leontiev 1981: 96–7) postulates that speaking, including its gesture component, is always and everywhere motivated and purposive and as such plays a central role in mediating all human activity, whether this entails influencing others in social interaction or ourselves in cognitive activity. Importantly, the other- and self-directed influences of communication do not necessarily occur as separate and independent activities; rather, as Valsiner (2001: 87) contends, “every world [sic] a person utters for others (heteroregulation) is simultaneously an act of regulation of oneself (autoregulation)”. This does not mean, of course, that the meaning of an utterance is the same for oneself as it is for the other, only that utterances, including their gestural accompaniments, have bidirectional regulatory potential. Hence, at the same time that speakers tell a story for others (communicative function), they construct an understanding of the narrative for themselves (cognitive function). In this way, the generally assumed distinction between communication and cognition is called into question, if not eliminated completely (see Frawley and Lantolf 1985; Appel and Lantolf 1994).

From the above perspective, gestures can simultaneously externalize a speaker’s personal understanding of a situation for another person and aid the speaker in developing this understanding. We therefore see no need to worry about whether gestures serve cognition or communication – they serve both, in the unity (n.b. not identity) formed by thinking and speaking. In Vygotsky’s words:

Speech does not merely serve as the expression of developed thought. Thought is restructured as it is transformed into speech. It is not expressed but completed in the word. Therefore, precisely because of the contrasting directions of the movement, the development of the internal and external aspects of speech forms a true unity. (1987: 251)

In the end, speaking and gesture form a functional communicative unit through which we make sense of things (i.e. regulate others and ourselves). We have more to say about this important notion in the following section.



Thinking for speaking and gesture

McNeill and Duncan (2000) examine the relevance of growth point for Slobin’s TFS framework and in so doing bring a new perspective to Slobin’s neo-Whorfian hypothesis. According to McNeill and Duncan (2000: 146), the opposition between image and language “is the key that unlocks speech” in that they mutually influence each other in a “continual movement back and forth” that recalls Vygotsky’s (1987) foundational thinking on the dialectic relationship between language and thought.

Similarly, for Slobin, in the activity of speaking, thinking takes on a particular quality as experiences are filtered through languages into verbalized events. Slobin (1996a) outlines three areas where linguistic relativity in the TFS framework may be relevant: L1 learning, the activity through which children appropriate their initial and deeply engrained TFS patterns (to be addressed in the discussion section); historical change, through which communities potentially transform the patterns; and, the most relevant for our purposes, the learning of additional languages. All three areas entail a developmental perspective, which is important, because as Vygotsky (1978) argues, the key to understanding human thinking resides in its genetic (i.e. historical) qualities. Therefore, in order to explore human thinking, he proposed a research methodology that focuses on the formation (e.g. L1 learning) and the reformation (e.g. L2 learning) of communicatively-based mental activity. In the case of L2 speakers, the focus of the present study, the interesting question is whether these individuals are able to regulate their TFS activity in complex tasks through the second communicative system (as reflected in the dialectical unity of image and speech, which would suggest a new growth point) or whether they continue to rely on their original system.

Motion events

Languages differ typologically in how space and motion events, the concern of the present study, are expressed linguistically. Talmy (1985, 2000) groups languages according to how they incorporate motion and spatial relations. A prototypical motion event has four components (Talmy 2000: 26):

a) figure: an object moving or located with respect to another object (ground)
b) ground: a reference object in relation to which the figure moves
c) path: trajectory or site occupied by the figure
d) motion: presence per se of motion or locatedness in the event.

Additionally, “a motion event can be associated with an external co-event that most often bears the relation of Manner or of Cause to it” (Talmy 2000: 26). Hence, we can distinguish two additional features:



e) manner: the particular way the motion is performed
f) cause: the efficient origin of a change in motion or location.

Considering how different languages express path, Talmy proposes two typological categories: satellite-framed languages and verb-framed languages. (Slobin 2003 refers to these respectively as S-languages and V-languages, a convention that we adopt here.) English, a typical S-language, indicates path of motion through particles or adverbs, while manner is encoded directly in the verb, as in the following example:

1) The little bird hops out of the cage.

In (1), the movement of the figure the little bird is expressed in the verb hop, which conflates motion and manner. The path trajectory of the figure against the ground the cage is encoded in the particles out and of. In contrast, Spanish, a V-language, rarely conflates manner with motion, preferring instead to encode this feature of events, if at all (as we will see in the data from the L1 Spanish speakers), through a separate lexical item. Path, on the other hand, frequently conflates with motion verbs in Spanish. Examples (2) and (3) illustrate common ways of describing motion events in Spanish:

2) El pajarito sale de la jaula dando saltitos.
‘The little bird leaves from the cage giving hops.’

3) El pajarito da saltitos y sale de la jaula.
‘The little bird gives hops and leaves from the cage.’

In (2), the verb sale conflates motion and path, and the clause dando saltitos indicates manner. In (3), the manner of motion is signaled through the verb da and the noun saltitos, while a second verb sale conflates motion and path against the ground NP la jaula. Although (2) and (3) have different syntactic structures, they are equivalent in meaning. The problem of translating manner from English into Spanish has been noted by Slobin (1996a), who comments on how novels written originally in English lose as much as half of their manner coloration in their Spanish translations.
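As an informal illustration, the contrast between examples (1) and (2) can be written down as a small data structure. The sketch below is not part of Talmy's or Slobin's apparatus: the class name `MotionClause`, its field names, and the function `framing_pattern` are hypothetical labels introduced here, and the rule applied (path in the verb versus path in a satellite) is a deliberate simplification of the typology.

```python
from dataclasses import dataclass


# Hypothetical encoding (not from the source): which event components
# the main verb conflates, and which are expressed outside the verb.
@dataclass(frozen=True)
class MotionClause:
    text: str
    verb: str
    verb_conflates: frozenset      # e.g. {"motion", "manner"}
    path_satellites: tuple = ()    # path expressed outside the verb
    manner_phrases: tuple = ()     # manner expressed outside the verb


def framing_pattern(clause: MotionClause) -> str:
    """Simplified rule: where is path lexicalized in this clause?"""
    if "path" in clause.verb_conflates:
        return "verb-framed"
    if clause.path_satellites:
        return "satellite-framed"
    return "undetermined"


# Example (1): manner in the verb, path in the satellites.
english = MotionClause(
    text="The little bird hops out of the cage.",
    verb="hops",
    verb_conflates=frozenset({"motion", "manner"}),
    path_satellites=("out", "of"),
)

# Example (2): path in the verb, manner in a separate phrase.
spanish = MotionClause(
    text="El pajarito sale de la jaula dando saltitos.",
    verb="sale",
    verb_conflates=frozenset({"motion", "path"}),
    manner_phrases=("dando saltitos",),
)

print(framing_pattern(english))
print(framing_pattern(spanish))
```

The rule looks only at where path is lexicalized, since that is the dimension on which Talmy's two categories are defined; manner is recorded but plays no role in the classification.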

Talmy (2000: 28) suggests three categories of motion + manner conflation, as illustrated in the following examples:

4) Non-agentive
a) The rock slid/rolled/bounced down the hill.
b) Smoke swirled/rushed through the opening.

Agentive
c) I slid/rolled/bounced the keg into the storeroom.
d) I twisted/popped the cork out of the bottle.

Self-agentive
e) I ran/limped/jumped/stumbled/rushed/groped my way down the stairs.
f) She wore a green dress to the party.



According to Talmy, the conflated constructions in (4) can (although perhaps a bit awkwardly) be expressed in a decomposed construction containing a motion verb and a manner co-event, usually expressed in a separate clause. Thus, a sentence such as (4a) could be formulated as The rock went down the hill, rolling in the process, (4d) might be expressed as I pulled the cork out of the bottle with a twist, and (4f) might be paraphrased as She went to the party wearing a green dress.

Spanish, as already mentioned, only rarely conflates motion and manner, preferring instead to mark movement and manner separately. Hence, the Spanish equivalent of (4d) above conflates motion and path and decomposes motion and manner: Saqué el corcho de la botella retorciéndolo ‘I pulled the cork from the bottle twisting it’ (Talmy 2000: 51). Nevertheless, Spanish can conflate motion and manner, as in the equivalent of (4f): La mujer llevó un vestido nuevo a la fiesta. Spanish also has an alternative way of expressing the same meaning which decomposes motion and manner, and in this case the English equivalent also separates motion and manner: La mujer fue a la fiesta llevando un vestido nuevo ‘The woman went to the party wearing a new dress’.

A very interesting distinction between English and Spanish emerges in comparing how the two languages describe climbing activity. In the English construction Fred climbed the stairs, it appears that motion and path conflate in the verb climb, and indeed most dictionaries seem to make this assumption. However, climb can also combine with path satellites to indicate a downward trajectory, as in Fred climbed down the ladder. It can also co-occur with other satellites to indicate a more or less horizontal movement, as in Fred climbed under the fence. Climb can also combine with complex satellites to describe more complex trajectories, as in Climb down out of the tree and Fred climbed out from under the bed. These examples seem to show that path only optionally conflates with a verb such as climb in English. As far as we can determine, path does not seem to function this way in Spanish; therefore, Spanish equivalents of at least some of the English examples with climb must be rendered with a different verb in each case: Fred subió al árbol ‘Fred climbed up the tree’, Fred bajó del árbol ‘Fred climbed down from the tree’, and Fred pasó por debajo de la valla ‘Fred climbed under the fence’.

Motion events and gesture

McNeill and Duncan (2000) have observed that Spanish and English speakers coordinate their gestures differently with motion verbs. On the one hand, Spanish speakers tend to focus their path gestures on path verbs in which motion and path conflate or on ground NPs, and they tend to mark manner lexically, gesturally, or through a combination of both. On the other hand, English speakers focus their path gestures on satellites or ground NPs and encode manner in the verb where motion and manner conflate. They may optionally signal manner through gesture but only if it is also indicated lexically, normally through a manner verb. This may appear to be redundant information, but according to McNeill and Duncan it is not, because in their view manner co-occurs in speech and in gesture in English when the speaker wishes to bring this feature of an event into focus.

Consider examples (5) and (6), discussed by McNeill and Duncan (2000: 150):4

5) [but it rolls] him out
Hand wiggles (= rotates): manner information

6) [and he rolls . . . down the drain spout]
Hand plunges straight down: path information only.

In both cases, different speakers are describing the same Warner Brothers cartoon scene in which Tweety Bird drops a bowling ball down a drain pipe as Sylvester the cat is climbing up the inside of the pipe. In (5) the speaker brings the manner of the bowling ball’s motion into focus, as indicated in the wiggling (= rotating) hand gesture that coincides with the manner verb roll. In (6), on the other hand, the speaker’s focus is on the path of the character’s motion, as signaled by the gesture spread over the satellite down and the ground drain spout. The difference between the two utterances reflects a difference in growth points and thus a difference in TFS, despite the use of the same verb in both cases.

Finally, consider the Spanish examples (7) and (8), also from McNeill and Duncan (2000: 150–1):

7) e entonces busca la ma[nera (silent pause)]
‘and so he looks for the way’
Gesture depicts the shape of the pipe: the ground

8) [de entra][r // se met][e por el][desague // ][sí?]
‘to enter REFL goes-into through the drainpipe . . . yes?’
Both hands rock and rise simultaneously: manner and path (left hand only through “mete”). Right hand continues to rise with rocking motion: path + manner.

Both utterances were produced by the same speaker. In (7) the speaker describes neither manner nor path but ground, by forming a pipe shape with his two hands. In (8) the speaker indicates path both verbally (‘goes-into through the drainpipe’) and gesturally by holding the shape of the drainpipe initiated in (7) and raising his hands. However, he marks manner through gesture, by rocking his hands back and forth while simultaneously moving them upward. According to McNeill and Duncan, the presence of a manner gesture (often co-occurring with path verbs and ground phrases) in the absence of verbally marked manner is a manner fog and is a frequently used pattern among Spanish speakers. Thus, in Spanish, growth points “with manner are categorized and enter the linguistic system as path and/or ground” (2000: 150). We will see this illustrated again in the case of our Spanish L1 speakers both in their L1 and L2 English performance.

The typological differences between Spanish and English with regard to how motion is encoded verbally and gesturally result in differences in the dynamics of TFS patterns between users of the respective languages. In English, gesture and speech (verbs) “highlight manner when it is part of the speaker’s focus and when manner is not in focus, gesture does not encode it and need not synchronize with a manner verb, even if one is present” (McNeill and Duncan 2000: 151). In Spanish, marking manner is a “challenge appearing only when it is a focused component, and it is often omitted even when it is potentially significant” (ibid: 152). Thus, the Spanish utterance exemplified in (9a) with manner marked verbally and in gesture is possible, as is the example presented in (10a) with manner indicated only through gesture. The English equivalent in (9b) with manner encoded in the verb and marked in gesture is appropriate, while the English (10b) rendering of the Spanish (10a) is problematic because manner is signaled through gesture only:

9a) Tarzan salta [de liana en liana] por la selva.
(hand makes a series of swinging motions from left to right)

9b) Tarzan [swings] through the jungle.

10a) Tarzan [salta] por la selva.
(hand makes a series of swinging motions from left to right)

10b) ?? Tarzan [jumps] through the jungle.5

These marked differences between Spanish and English are extremely interesting for L2 research because they allow us to explore how TFS about motion events occurs in the performance of L2 speakers; that is, do the speakers manifest appropriate L2 gesture patterns when communicating in this language, or do they continue to rely on their L1 patterns of TFS? This is all the more interesting because the interface between gesture and speech is not explicitly taught in language courses. Moreover, as McNeill’s research has shown, most speakers are unaware of the differences between their L1 and other languages with regard to iconic gestures employed in TFS activities. Indeed, most speakers are not usually aware of these types of gestures in communicating either in their L1 or their L2 (see Jiménez-Jiménez and Centeno-Cortés 2002).6

Gesture and L2 thinking for speaking

Stam’s (2001) study focused on the gesture–speech interface in the performance of five intermediate and five advanced ESL (Spanish L1) speakers describing McNeill’s Tweety Bird cartoon. Stam’s work is squarely situated within Slobin’s TFS framework and addresses essentially the same issues as our study. Stam reports that some of her intermediate and advanced speakers exhibited the beginnings of a shift toward an English TFS pattern, manifested as a slight increase in the percentages of path gestures synchronized with satellites, the preferred English pattern, as attested in Stam’s English controls. This change was accompanied by a slight shift in both groups away from the preferred L1 Spanish pattern of marking path-only gestures on the verb. For manner, both ESL groups followed the Spanish L1 pattern, marking manner through a gesture even in its absence in speech, a pattern that is extremely rare in English and was attested in only one instance in Stam’s L1 English controls (to show velocity of movement).

To our knowledge, in addition to Stam’s (2001) research, only two other studies have been carried out on the interface between gesture and speaking in an L2 – Dushay (1991) and a recent study by Kellerman and van Hoof (in press).7 Dushay investigated whether speech-focused gestures serve a communicative or a cognitive function. The participants in his study were 20 fourth-semester university learners of Spanish as a foreign language who were asked to describe a series of novel abstract figures and sounds. Half of the time learners spoke in their L1 English and half of the time in their L2 Spanish. Contrary to what might be expected, when the participants spoke in Spanish, particularly when describing sounds, they produced fewer speech-focused gestures than when speaking in their L1. Moreover, Dushay found a negative correlation between speaker fluency in the L2 and frequency of speech-focused gestures. The author concluded that his study failed to provide direct support for either the communicative or the cognitive function hypothesis of gesture.

The findings of Kellerman and van Hoof’s study are complex. They compared the narrations of Mayer’s (1969) frog story by seven L1 Spanish and seven L1 Dutch speakers in L2 English with their performance in their respective L1s as well as with L1 speakers of English. First of all, the L1 Spanish speakers showed a strong tendency to transfer their L1 gesture–speech patterns to their L2 English performance, synchronizing path gestures with verbs instead of satellites. The authors do not say much about how the Spanish speakers dealt with manner events in English. In two instances where English speakers would likely use a manner verb to describe the motion of the bees in the story, two different Spanish speakers used the non-conflated motion verb went: And the bees went on his back; The bees went outside. In both examples, while the verb conveys motion, an L1 English speaker would most likely use a manner verb such as swarm to depict the movement of the bees rather than the path verb go. What is more, the speakers did not seem to encode manner in gesture in either case, which, according to McNeill’s analysis of Spanish L1, is acceptable when speakers opt not to focus on manner. This particular pattern is also attested in the data from our L1 Spanish speakers in their L2 English performance (see below).



Kellerman and van Hoof characterize their second finding as “mysterious” and as the “most intractable” of their study. Dutch, like English, is an S-language, and as such synchronizes path gestures with prepositional phrases or grounds. However, in their L2 performance, where we would have anticipated an S-language pattern, the majority of the Dutch participants, in fact, produced patterns paralleling the Spanish speakers, with path gestures co-occurring on motion verbs. For example, one L1 Dutch speaker synchronized a gesture with the verb threw: And Timmy was thrown [gesture = hand flicks from left to right] over his head. Talmy (2000) includes throw among the verbs that conflate motion and cause, which in his analysis are similar to manner verbs in that the former focus on the object of an action and the latter highlight its agent. Indeed, one of Kellerman and van Hoof’s L1 English participants also synchronized his gesture with the verb threw, which leads us to their third finding.

Half of Kellerman and van Hoof’s L1 English speakers spread their path gestures to both a verb and its satellite rather than marking them exclusively on the satellite, a finding that is contrary to what has been generally reported in the gesture literature, the present study included (see Table 3 below). The authors speculate that perhaps this is possible in an S-language such as English because of the greater cohesion between verbs and satellites, whereas in an S-language like Dutch, the verb–satellite nexus is often broken because of word order constraints, Dutch being a verb-final language. This is an interesting claim which merits further investigation; however, we need to keep in mind that verbs such as throw in English are, according to Talmy, not path verbs but are instead a variation of manner verbs, where motion + cause conflate. Kellerman and van Hoof, however, are silent on the nature of the verbs used by their speakers. To be sure, one of the L1 English speakers also produced a path gesture that spread across the verb come and its satellite back, in The dog comes back, a pattern not reported in McNeill’s research, as far as we are aware. While Kellerman and van Hoof mention Slobin’s work, their study focuses more on the properties of gesture than it does on the TFS question.

The study

As mentioned earlier, the present study is not intended as a replication of Stam’s research but as an extension of her work and to a lesser extent the research of Kellerman and van Hoof. One of the problems that prevents a close comparison of the three studies is the matter of proficiency. Stam reports that her L2 speakers were drawn from intermediate and advanced ESL classes at an urban midwestern U.S. university. However, it is impossible to know if her advanced speakers are in any way comparable to the advanced speakers included in our study. Kellerman and van Hoof do not report on the proficiency level of their L2 speakers. Also, the present study considers L2 speakers of Spanish as well as English, while the others considered L2 speakers of English only. Finally, none of the three studies includes a particularly high number of participants. As Kellerman and van Hoof remind us, collecting and analyzing data on gesture is extremely labor-intensive.

Research question

The research question for the present study is straightforward: Do advanced L2 speakers shift toward an L2 TFS pattern or do they rely on their L1 pattern as evidenced in the gesture/speech interface? This question is expanded in the following more specific hypotheses with regard to both manner and path events:

Manner of motion events

a) If L2 Spanish speakers maintain their L1 English TFS orientation, we would expect them to have difficulties describing manner events, because Spanish has a relatively impoverished inventory of fine-grained manner verbs.

b) Given that fine-grained manner verbs in Spanish are low frequency, we would not expect L1 Spanish speakers to use these verbs when narrating spontaneously in their L1. Similarly, we would not expect L2 Spanish speakers to be able to access these low-frequency verbs. However, if these speakers were relying on L1 English TFS, we would expect some dysfluency as they attempted to search for such verbs in their mental lexicons.

c) If L2 Spanish speakers maintain their L1 TFS, we would expect them to avoid manner fogs (i.e. expressing manner through gesture only).

d) If L2 Spanish speakers shift to L2 TFS, we would expect them to use manner fogs and to avoid lexical searches for fine-grained manner verbs.

e) If L2 English speakers continue to use their L1 Spanish TFS pattern, we would expect them to avoid fine-grained manner verbs and to use manner fogs.

f) If L2 English speakers shift to L2 TFS, we would expect them to synchronize gestures with typical English conflated manner verbs and to avoid manner fogs.

Path of motion events

a) If L2 Spanish speakers maintain their L1 English TFS orientation, we would expect them to synchronize their path gestures with ground NPs or satellites, but not with path verbs. We would also expect these speakers to occasionally attempt to use complex satellites (e.g. up from, down into).

b) If L2 Spanish speakers shift to L2 TFS, we would expect them to use path-conflated verbs and to synchronize gestures with the verbs or ground NPs, but not with satellites.

c) If L2 English speakers maintain their L1 Spanish TFS, we would expect them to synchronize path gestures with conflated path verbs or ground NPs.

d) If L2 English speakers shift to L2 TFS, we would expect them to synchronize path gestures with ground NPs or satellites but not with path verbs, and we would expect them to use complex satellites.

e) Given that both languages allow path gestures to synchronize with ground NPs, this is not an environment where shifts in TFS can be appropriately assessed.
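
The logic of the path hypotheses can be summarized as a lookup from the speech element with which a path gesture synchronizes to the TFS pattern it would suggest. The following Python sketch is purely illustrative: the function name and the labels are hypothetical and are not part of the coding procedure used in the study.

```python
# Illustrative sketch only: the labels and names below are hypothetical.
# They restate path hypotheses (a)-(e), not the study's actual coding tools.
DIAGNOSTIC = {
    "verb": "Spanish-like",         # gesture falls on a path-conflated verb
    "satellite": "English-like",    # gesture falls on a satellite (e.g. out, into)
    "ground_np": "non-diagnostic",  # both languages allow this site (hypothesis e)
}


def classify_path_gesture(site: str) -> str:
    """Return the TFS pattern suggested by the synchrony site of a path gesture."""
    try:
        return DIAGNOSTIC[site]
    except KeyError:
        raise ValueError(f"unknown synchrony site: {site!r}")


print(classify_path_gesture("satellite"))  # English-like
```

Treating ground NPs as non-diagnostic follows hypothesis (e); any quantitative comparison built on such a lookup would therefore rest on the verb and satellite sites alone.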

Participants

A total of twelve speakers participated in the study – three L2 Spanish (L1 English) speakers (participants 8, 9, 10), three L2 English (L1 Spanish) speakers (participants 4, 5, 6), three monolingual English controls (participants 1, 2, 3), and three monolingual Spanish controls (participants 7, 11, 12). All of the L2 participants were highly proficient in their L2, having lived in the relevant L2 context for at least a year. All of the L2 participants were enrolled in a Spanish graduate program at a large U.S. research university, and all were, or had been, graduate teaching assistants in this program. The L2 English speakers were natives of Spain, where they had earned undergraduate degrees in English before coming to the U.S. One of the L2 English participants (4) was married to an American and had been living in this country for six years at the time of the study and used English on a daily basis. A second L2 English speaker (5) had been living for two years in the U.S. with an American partner who spoke only English. The third L2 English participant (6) had been living in the U.S. for more than a year at the time of the study. Of the L2 Spanish participants, who were all Spanish instructors, one (8) had spent more than a year and a half in residence in Spain. Another (9) had spent two years living and working in Spain before enrolling in the graduate program. The third L2 Spanish speaker (10) had spent nearly two years in Spain and was married to a Spaniard and used Spanish as the primary domestic language.

Elicitation procedure

The participants were asked to construct a narrative chosen from Frog Comes to Dinner (Mayer 1979). The story has no text, and the pictures were transferred to transparencies so participants could view them as they were presented sequentially on an overhead projector, leaving their hands free to gesture. A description of that portion of the full story used in the study is included in the Appendix.

The task comprised two parts. First, the participants were seated facing one another and were asked to co-construct the narrative by describing to each other the separate events of the story projected on a wall behind their partner. Each participant only saw half of the total number of transparencies. Thus, in any given pair of speakers, participant A only saw the odd-numbered pictures in the sequence and participant B only saw the even-numbered pictures. For the first task, participants were paired up as follows:

Narrated in English
Participants 1 (L1 English) and 2 (L1 English)
Participants 3 (L1 English) and 4 (L1 Spanish)
Participants 5 (L1 Spanish) and 6 (L1 Spanish)

Narrated in Spanish
Participants 11 (L1 Spanish) and 12 (L1 Spanish)
Participants 7 (L1 Spanish) and 8 (L1 English)
Participants 9 (L1 English) and 10 (L1 English)

In the second task, participants individually re-told the entire story to one of the researchers in the language of the initial narrative construction. Participants were informed that they would be both audio and video taped throughout both sessions. The participants were told that the researchers were interested in how they constructed narratives in their L1 and their L2. They were not told of our specific interest in the gesture–speech interface. They were asked to be maximally expressive and use as much detail as possible in relating the story.

Coding and transcription

Even though speakers produced a wide array of gestures, including deictic, beats and emblems, only gestures related to motion events were considered for analysis. Following Stam (2001), two categories of gestures connected to motion events were coded for each participant: path only, and path and manner gestures together. Using McNeill’s coding system (see note 4), the synchrony of the stroke phase of gestures with speech was analyzed by three of the researchers working together using standard VCRs. First, movements that were considered relevant gestures were identified; second, the gesture phase was coded by determining the onset, or preparation phase, and the offset, or retraction phase, of the gesture; finally, the stroke of the gesture was determined semantically (the content-bearing part of the gesture) and kinesically (where the gesture movement has its peak in the quality of effort: the motion is concentrated on the form of the motion itself). The analysis was then verified by Negueruela using professional-grade VCRs in David McNeill’s laboratory at the University of Chicago. We also identify the relevant type of gesture with capital letters (path or manner) and separate this from a description of the gesture with a colon. Example (11) illustrates the coding scheme:

11) He’s about [to be grabbed by], I think it’s the drummer . . .
    (manner: both hand and arms co-ordinated, motion of grabbing⁸)
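
For readers who wish to handle examples of this kind computationally, the notation in (11) can be read into a small record. The sketch below is a hypothetical illustration, not a tool used in the study. It assumes, on the basis of the examples in this article, that square brackets enclose the speech with which the gesture co-occurs and that the annotation has the form "(type: description)". The class and function names are invented for the illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class CodedGesture:
    gesture_type: str      # e.g. "manner" or "path"
    description: str       # free-text description following the colon
    bracketed_speech: str  # speech enclosed in square brackets


def parse_example(utterance: str, annotation: str) -> CodedGesture:
    """Read one transcribed utterance and its gesture annotation (hypothetical helper)."""
    # Assumption: the first [...] span marks the speech co-occurring with the gesture.
    match = re.search(r"\[(.*?)\]", utterance)
    bracketed = match.group(1).strip() if match else ""
    # Assumption: the annotation looks like "(type: description)".
    gesture_type, _, description = annotation.strip().strip("()").partition(":")
    return CodedGesture(gesture_type.strip(), description.strip(), bracketed)


example = parse_example(
    "He's about [to be grabbed by], I think it's the drummer",
    "(manner: both hand and arms co-ordinated, motion of grabbing)",
)
print(example.gesture_type, "|", example.bracketed_speech)
# manner | to be grabbed by
```

An utterance with no brackets, such as one coded "(no gesture)", yields an empty `bracketed_speech` field under this sketch.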

Analyses

We divide the analysis into two parts: the first focuses on path constructions and the second on manner constructions. In each case we first present in quantitative format the overall frequency of relevant gestures for each of the twelve participants. We follow this with a close analysis of specific utterances from individual speakers that are especially revealing with regard to potential change in TFS orientation. Although the utterance-based analysis focuses on several problematic cases, which might give the impression that the L2 speakers were in fact not highly proficient, we hasten to point out that, with the exception of utterances dealing with motion events, the speakers’ performances in both languages were quite good.

We begin with the overall frequencies of motion-event gestures produced by each speaker in both tasks. Tables 1 and 2 provide the overall frequencies of motion-event gestures for each task as well as the combined percentages produced by all speakers. In general (except for participants 3, 9, and 12), the second part of the task (i.e. narrating the complete story) generated a higher percentage of motion-event gestures than did the first.⁹

Table 1. Frequency of motion-event gestures produced with motion verbs in English narratives in each task and combining both tasks

Participant      Task 1         Task 2         Combined
1 L1 English     69% (9/13)     81% (13/16)    76% (22/29)
2 L1 English     22% (2/9)      82% (14/17)    62% (16/26)
3 L1 English     73% (19/26)    17% (2/12)     55% (21/38)
4 L2 English     65% (13/20)    75% (12/16)    69% (25/36)
5 L2 English     8% (1/12)      96% (26/27)    69% (27/39)
6 L2 English     29% (2/7)      72% (13/18)    60% (15/25)

(total number of gestures/events in parentheses)



Table 2. Frequency of motion-event gestures produced with motion verbs in Spanish narratives in each task and combining both tasks

Participant      Task 1         Task 2          Combined
7 L1 Spanish     27% (3/11)     62% (8/13)      46% (11/24)
11 L1 Spanish    97% (31/32)    100% (28/28)    98% (59/60)
12 L1 Spanish    81% (13/16)    79% (23/29)     80% (36/45)
8 L2 Spanish     25% (2/8)      33% (2/6)       29% (4/14)
9 L2 Spanish     92% (23/25)    90% (9/10)      91% (32/35)
10 L2 Spanish    44% (4/9)      59% (10/17)     54% (14/26)
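
The percentages in Tables 1 and 2 follow directly from the counts in parentheses, and the combined column pools the counts from both tasks rather than averaging the two percentages. A minimal Python sketch of that arithmetic is given below. The function names are hypothetical, and rounding to the nearest whole number is our assumption about how the figures were prepared.

```python
import math


def pct(gestures: int, events: int) -> int:
    """Whole-number percentage of motion events accompanied by a gesture."""
    # Round half up explicitly (an assumption about the tables' rounding).
    return math.floor(100 * gestures / events + 0.5)


def combine(task1: tuple, task2: tuple) -> tuple:
    """Pool (gestures, events) counts from both tasks and recompute the percentage."""
    gestures = task1[0] + task2[0]
    events = task1[1] + task2[1]
    return gestures, events, pct(gestures, events)


# Participant 7 (Table 2): 3/11 in task 1 and 8/13 in task 2.
print(pct(3, 11), pct(8, 13))     # 27 62
print(combine((3, 11), (8, 13)))  # (11, 24, 46)
```

Pooling the counts weights each task by the number of motion events it contains, which is why the combined figure of 46% is not the simple mean of 27% and 62%.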


Table 3. Frequency of path-only gestures in English narrations with verb, satellite, or ground NPs

Participant      Verb          Satellite     Ground NP
1 L1 English     0% (0)        78% (7/9)     22% (2/9)
2 L1 English     0% (0)        100% (8/8)    0% (0)
3 L1 English     0% (0)        43% (3/7)     57% (4/7)
4 L2 English     33% (3/9)     11% (1/9)     56% (5/9)
5 L2 English     23% (3/13)    8% (1/13)     69% (9/13)
6 L2 English     29% (2/7)     0% (0)        71% (5/7)


Path analysis

Tables 3 and 4 present the percentages for path-only gestures in English and Spanish distributed across verb, satellite and ground NPs. It seems clear from a comparison of these tables that both L2 groups maintain their L1 TFS.

Participants 1, 2, and 3 in Table 3 exhibit the expected L1 English TFS pattern, as documented in McNeill’s research. Thus, path gestures are concentrated on satellites and ground NPs. On the other hand, the L2 English speakers (participants 4, 5, 6) manifest a different pattern – a pattern that reflects L1 Spanish TFS even though they are verbally communicating through their L2. The majority of path gestures are encoded either on the verb or on the ground NPs, as occurs in Spanish L1. To be sure, there are two instances of what looks like English patterning, with the gesture occurring on a satellite, but given that the occurrence is not particularly robust, we cannot be sure that it shows a shift in TFS.



Table 4. Frequency of path-only gestures in Spanish narrations with verb, satellite, or ground NPs

Participant      Verb          Satellite    Ground NP
7 L1 Spanish     67% (4/6)     0% (0)       33% (2/6)
11 L1 Spanish    82% (9/11)    0% (0)       18% (2/11)
12 L1 Spanish    80% (4/5)     0% (0)       20% (1/5)
8 L2 Spanish     0% (0)        0% (0)       100% (1/1)
9 L2 Spanish     44% (4/9)     11% (1/9)    44% (4/9)
10 L2 Spanish    25% (1/4)     25% (1/4)    50% (2/4)


In Table 4, we see the percentages of path gestures produced by L1 and L2 speakers of Spanish. In narrating the story in their first language, participants 11 and 12 exhibit the anticipated Spanish pattern in which path gestures co-occur with the oral production of the verb. Participant 7, also an L1 Spanish speaker, produced a lower percentage of path gestures on the verbs and a greater percentage on the ground NPs than did the other two L1 Spanish speakers. Importantly, however, none of the L1 Spanish speakers produced path gestures on satellites. The L1 English speakers exhibit path gestures in L2 Spanish that seem to resemble patterns used by L1 Spanish speakers (cf. Table 3), thus indicating a possible shift in TFS. However, as we will discuss in our analysis of specific utterances, instead of showing a shift, the quantitative data appears to be masking the speakers’ struggle to use gestures on satellites when Spanish does not offer the same semantically complex set of prepositions as does English. As Slobin (1996a: 83) points out, “Spanish prepositions, by contrast to English, provide minimal locative specification.”

Although the quantitative evidence is suggestive, it is inconclusive with regard to shifts toward an L2 TFS orientation. To shed more light on the matter, a closer analysis of those instances that seem to be deviating from L1 TFS is necessary. We carry out this analysis for path gestures in the next section.

Utterance-based analysis for path

We begin with participant 9 (L1 English narrating in L2 Spanish). In the quantitative data (Table 4) this particular speaker produced four path gestures on verbs, one on a satellite and four on ground NPs. On the face of it, this might seem to indicate a shift to a more Spanish-like TFS pattern. However, if we examine specific utterances, as for instance given in (12), we see that things are not quite so straightforward:

12) [el sapo entró /, bueno, saltó / como dentro del saxofón]
    ‘the toad entered, well, jumped, like inside the saxophone’
    (path: hand comes up marking trajectory, stops, and then again marks trajectory and stops)

In (12), the speaker seems to feel that the verb entrar ‘to enter’ does not capture the meaning of ‘jumping into a saxophone’. While producing the verb, the speaker begins to deploy a path gesture but then halts in the middle of its trajectory. She then utters the word bueno ‘well’, indicating a filled pause, which allows her time to modify her utterance, resulting in the verb saltó ‘jumped’. The narrator then continues the path gesture, but again interrupts it in mid-stream as she struggles to externalize an appropriate path preposition. The problem is that because Spanish verbs conflate path with motion, a complex preposition, as would likely be used in English, is not a viable option to indicate trajectory. Indeed, in Spanish a simple preposition such as en ‘in’ does not inherently convey path. Thus, an utterance such as La rana saltaba en el saxofón ‘the frog was jumping in the saxophone’ can have two meanings, one in which the frog is in the act of jumping into the saxophone and another in which the frog is jumping while already inside of the instrument. Thus, L1 speakers prefer the path-conflated verb meterse ‘to get into’, which eliminates any ambiguity (see example 13). Instead of using a simple preposition, such as en or al, participant 9 constructs a very complex prepositional string composed of three elements (como, dentro, del), which clearly betrays an English TFS pattern.

Compare (12) with (13), which was produced by participant 7, an L1 Spanish speaker:

13) la [rana se ha metido en el saxofón]
    ‘the [frog got into the saxophone]’
    (path: hand cupped describes trajectory, other hand mimicking the bell of the saxophone)

The speaker uses meterse ‘to get into’, which conflates motion and path, to express directionality and synchronizes this verbal meaning with a path gesture in order to more precisely illustrate the trajectory of the frog’s movement. This example is particularly interesting because it fails to express either verbally or through gesture the manner of the frog’s locomotion. That is, meterse does not impart the idea that the frog either jumped or hopped into the saxophone, only that it moved into the instrument. Linguistically, the speaker could have indicated the manner of the frog’s motion with the phrase de un salto ‘with a jump’. The fact that a manner gesture is also avoided in this case is interesting, given that Spanish speakers can express manner through a manner fog. The speaker may have assumed that world knowledge of how frogs typically move about was sufficient for the hearer to know the manner of its motion.



In (14), participant 3, an L1 English speaker, produces a manner verb and deploys a complex path gesture on a series of three satellites in order to depict the trajectory of the jumping motion of the frog from the boy’s pocket through the restaurant and into the saxophone. This way of indicating continuous movement through a single space, according to Duncan (ms.), is typical of English TFS.

14) the frog . . . [jumps out of the boy’s pocket through the restaurant into a saxophone]
    (path: three strokes, one falling on each of the prepositional satellites; the first with the index finger extended while the hand emerged from the pocket, the second with the hand moving away from the body and the index finger still extended, and the third with the hand moving downward and the index finger still extended)

Participant 4 (L1 Spanish), with extensive immersion experience in the U.S., produced the path utterance in (15) where the gesture synchronized with a portion of the satellite and the ground NP to describe the same event as in (14), thus potentially suggesting a shift to an English TFS orientation. In Spanish, a path gesture can occur on a ground NP but not on a satellite preposition.

15) the frog jum[ped on /// to the saxophone]
    (path: both hands coming up and moving away from the body)

The odd feature of this example, however, is the noticeable pause that occurred after the preposition on, the site where we would normally expect an English L1 path gesture to begin. Compared to the L1 English example in (14), there is only a minimal description of the frog’s full trajectory. In essence, the speaker is saying that the frog jumped in the direction of the saxophone.

In (16), participant 6 (L1 Spanish) does not use any gesture to convey the meaning that the frog ‘jumped into the saxophone’:

16) she went to the musician and got inside the saxo
    (no gesture)

Again, in this case, the meaning expressed by got inside reflects more the Spanish meterse used in (13) than the English jumped into. The speaker describes the setting but not the trajectory of the frog’s motion. Particularly revealing is the speaker’s use of two finite verbs (went, got), a clear parallel to a Spanish pattern which, unlike in the L1 English example (14), carves up the full space into subspaces, as illustrated in the possible Spanish rendering of the English utterance: De un salto la rana salió del bolsillo, cruzó el restaurante, y cayó en el saxofón ‘In a single jump, the frog left the pocket, crossed the restaurant and fell into the saxophone’.



In (17), the same L1 Spanish participant continues to display L1 TFS patterns by synchronizing the path gesture with the verb instead of the satellite as she describes the woman falling backwards in her chair:

17) she [falls in it]
    (path: entire upper torso moves in a backward direction)

If we now compare (17) with (18) from participant 1 narrating the same action in L1 English, we observe a different pattern:

18) the woman [falls back]
    (path: entire upper torso moves in a backward direction)

In (18), the speaker, like speaker 6 in (17), also moves his upper torso in a backward direction, but does so on the satellite rather than on the verb, thus revealing an English TFS orientation.

In the case of the L2 Spanish speakers, we also note clear L1 English patterns, as illustrated in (19), produced by participant 9:

19) y me parece que se va a caerse [pa detrás]
    ‘and it seems to me that she is going to fall [backwards]’
    (path: hand and body leaning back)

Here the speaker follows the L1 English pattern, encoding path verbally on an invented complex satellite rather than on the verb, as would be expected from an L1 Spanish user. In her gestures she also displays an L1 English pattern, as she uses two strokes, one on pa ‘toward’ and another on detrás ‘back’.

Manner analysis

The data presented in Tables 5 and 6 show that L2 speakers by and large maintain their L1 gesture patterns when narrating in their second language; nevertheless, we uncovered a few revealing variations that are worth commenting on. The L1 Spanish participants (4, 5, 6) manifest a Spanish pattern for manner gestures in their L2 English, producing these gestures even when manner is not marked verbally. Their behavior is clearly distinct from the L1 English speakers and supports the idea that high levels of verbal proficiency do not necessarily reflect the ability to think through the L2.

In Table 5, we see that the L1 English participants express manner with gestures when it is also present in speech and, with only two exceptions, fail to use gesture when a manner verb is also missing in their verbal production, a pattern that is possible in English when manner is not in focus. The L1 Spanish speakers, however, rely heavily on manner gestures when speaking in English and particularly where manner is not marked on the verb – a pattern that strongly parallels L1 Spanish (i.e. a manner fog).



Table 5. Frequency of manner gestures with/without verbally expressed manner in both tasks in English narratives

Participant      Verbally expressed    Not verbally expressed
1 L1 English     89% (8/9)             11% (1/9)
2 L1 English     100% (1/1)            0% (0)
3 L1 English     92% (12/13)           8% (1/13)
4 L2 English     22% (4/18)            78% (14/18)
5 L2 English     11% (1/9)             89% (8/9)
6 L2 English     0% (0)                100% (9/9)


In Table 6, we observe that L1 English narrators manifest considerably fewer instances of manner in their L2 Spanish when compared to L1 English performance (Table 5). Speaker 9 is especially interesting, since she uses a high percentage of manner gestures in the absence of manner verbs in speech – a pattern that at first glance appears to reflect a Spanish rather than English TFS orientation. It is difficult to determine the pattern for the other two L2 Spanish speakers, since they only encoded manner a total of four times between them.

Utterance-based analysis for manner

As with path utterances, a close analysis of manner utterances reveals that the speakers maintain their L1 TFS in the L2. In example (20), we observe that participant 9 expresses manner through gesture in her L2 Spanish performance when it is not encoded in speech, an apparent L1 Spanish manner fog pattern (cf. example 21).

Table 6. Frequency of manner gestures with/without verbally expressed manner in both tasks in Spanish narratives

Participant      Verbally expressed    Not verbally expressed
7 L1 Spanish     25% (1/4)             75% (3/4)
11 L1 Spanish    27% (4/15)            73% (11/15)
12 L1 Spanish    23% (6/26)            77% (20/26)
8 L2 Spanish     100% (2/2)            0% (0)
9 L2 Spanish     44% (4/9)             56% (5/9)
10 L2 Spanish    50% (1/2)             50% (1/2)




20) la ensalada [está // como en medio aire]
    ‘the salad [is like in mid-air]’
    (manner: hand shaking palm down)

In (20) it is important to note the crucial pause following the copula verb está. The speaker begins to deploy a manner gesture but is unable to find an appropriate verb in her L2 Spanish lexicon, and at this point she pauses, a sign that she is most likely engaged in a lexical search for just such a verb (see Goldman-Eisler 1968).¹⁰ It is likely that this speaker knows the high-frequency Spanish cognate moverse of English to move, which would be a common way of describing the motion event in Spanish. It is therefore significant that she does not use this verb. The reason, in our view, is that for an L1 English speaker, whose language is saturated with manner verbs (Slobin 2003), moverse is not sufficiently nuanced to bring manner into focus. It is also significant that the beginning of the manner gesture synchronizes with the auxiliary verb está and extends beyond the pause.

Recall that McNeill and Duncan (2000: 151–2) have suggested that the key to focus in English resides in the co-occurrence of speech and gesture. This is because, typologically, English has a rich inventory of verbs which conflate motion and manner, and so it regularly encodes manner of motion in speech. To bring manner into focus, therefore, English speakers need to do more than produce a manner verb. They must doubly encode manner in speech and in gesture. In contrast Spanish, typologically a V-language, does not have a robust repertoire of manner-conflated verbs and, consequently, normally encodes manner through gesture (i.e. manner fog). In order to bring manner into focus, therefore, Spanish speakers must minimally encode it in speech, either in the verb (when available) or in a separate lexical item, as in va flotando ‘goes floating’.

Returning now to example (20), after the critical pause in which the speaker is unable to locate an appropriate Spanish manner verb, she manages to cobble together a string which at face value is an adverbial phrase, como en medio aire ‘like in mid-air’; however, this string occupies the syntactic position that would otherwise be filled by the progressive form of a Spanish verb, such as tambaleándose ‘tumbling about’ (one of the few conflated manner verbs), and it imparts a meaning that is similar to the Spanish verb. In a sense, following the failed search, the speaker creates a verb which would not be recognized as such in Spanish but which, in the caldron of communicative interaction, takes on this function. This seems to be a very nice example of what Hopper (1998) postulates for emergent grammar, in which the speaker grammaticalizes a lexical string, normally construed as an adverbial, as a ‘verb’ that carries manner.

The important point is that the speaker’s lexical search was triggered by the need to locate a verb that expressed manner, which along with the already-initiated manner gesture would bring manner into focus, thus displaying an English TFS pattern. If speaker 9 had been using Spanish TFS, moverse, as mentioned above, would have been an appropriate choice, but it would not have brought manner into focus. This could only have been achieved through use of the verb tambalearse. If the speaker had deployed this verb, however, she would not have needed to also mark manner through gesture in order to bring this feature of the event into focus in Spanish. For these reasons, we do not consider the pattern in (20) to represent a shift toward Spanish L1 TFS.

It is interesting to compare (20) with the L2 English performance of one of Kellerman and van Hoof’s Dutch L1 speakers. In describing the action of the dog in one of the scenes in the frog narrative, the speaker says the dog falls out of the window and uses a “tumbling gesture that conflates manner and path” synchronized on the satellite and ground NP, indicated in bold script. It seems clear that because of the manner gesture the speaker wants to communicate the manner of the dog’s falling motion, which cannot be achieved through the verb fall. If the speaker did not know the appropriate English manner verb tumble, then use of fall + a manner gesture could well have functioned as a compensatory strategy, much in the way speaker 9 compensated for her inability to access the appropriate Spanish verb tambalearse. In a sense, the Dutch speaker created (grammaticalized) a manner verb through a combination of a common motion verb + manner gesture.

The interesting feature in the Dutch speaker’s utterance is that the gesture does not coincide with the verb but synchronizes with the satellite and ground NP. Recall from our discussion of focus that English speakers can use manner verbs without manner gestures, as in example (6) where the speaker focuses on the path of the motion instead of on its manner. When manner is brought into focus, however, a manner verb and a synchronized manner gesture is the preferred pattern, as in (5) above. In the case at hand, then, using a manner gesture without an overt manner verb is an inappropriate English pattern; moreover, using a manner gesture that fails to coincide with the verb is also problematic. However, if the speaker wanted to focus on path while at the same time expressing manner of the motion, as in (6), then the pattern she devised makes sense: manner = motion verb + manner gesture, path = manner gesture moving in a downward trajectory on the satellite and ground NP. It would have been quite cumbersome to have used two distinct gestures – a tumbling gesture on fall followed by a downward path gesture on the satellite.

Compare now (20) with example (21) from participant 11, example (22) from participant 3 and example (23) from participant 5:

21) la ensalada [hecha un desastre] (L1 Spanish)
    ‘the salad [is a disaster]’
    (manner: hand shaking)

22) the plate’s [kind of tumbling a little bit] (L1 English)
    (manner: hand shaking)

23) and [the cup, the plate, the fork are all falling off the table] (L2 English)
    (path + manner: four consecutive strokes with both hands, palms facing each other, vigorously moving upward (last stroke more pronounced))

In (21) the speaker has no problem encoding manner gesturally, even when it is not present in speech. In this case, however, Spanish is the speaker’s L1 and she finds an appropriate way through gesture alone of encoding the tumbling motion of the salad plate as it was being carried by the waiter. Note that she could have used the Spanish manner verb tambalearse; to do so, however, would have brought manner into focus, an option the speaker apparently did not wish to express.

In (22), an L1 English speaker narrates the same event related in (21) but, in contrast to the L1 Spanish speaker, brings manner into focus through use of a nuanced manner verb synchronized with a manner gesture. There is also something else interesting going on in (22). Notice that the speaker modulates the tumbling motion through use of the modifiers kind of . . . a little bit wrapped around the verb. The modifiers themselves, although intended as a verbal means of modulating the intensity of the tumbling motion, are open to further modification, because for the speaker it doesn’t do justice to the degree to which the plate is moving about. The shaking-of-the-hand gesture, however, serves to modify the modifier, as it were, and thus provides a more precise description of the motion.

Example (23) is especially interesting because in this case participant 5 (L2 English) merges path and manner gestures and synchronizes these with the path verb fall, thus reflecting L1 Spanish. The gesture seems to contradict the meaning of the verb, since the speaker uses fall while the gesture depicts an upward motion (both hands coming up), which portrays path of motion. However, given that the scene is very much about the manner of the plate’s movement and given that the other speakers used manner in speech, in gesture, or in both, we think that it is not unreasonable to assume that the motion of the speaker’s hands in this case also encodes manner (a flying upward motion of plate, fork and salad), particularly since the gestures occur with repeated strokes on the items mentioned as flying into the air and on the verb. Moreover, even if the gestures only indicated path, which we don’t think is the case, in an English TFS pattern they would have likely synchronized with the satellite off or the ground NP the table. Once again, despite being very proficient in L2 English, the speaker manifests a typical Spanish manner fog, encoding manner exclusively in a gesture.

Consider the next set of examples:

24) the [frog appears] . . . from inside the salad (L2 English)
    (manner: both hands coming up)

25) she goes to take a bite from the salad and out jumps the frog (L1 English)
    (no gesture)

26) De pronto se asoma el sapo sonriendo a la mujer. (L2 Spanish)
    ‘Suddenly the toad sticks out smiling at the woman.’
    (no gesture)

27) Le aparece la rana. (L1 Spanish)
    ‘To her [i.e. the woman] appears the frog.’
    (no gesture)

In (24) participant 4 (L2 English) uses the English equivalent of the Spanish verb that could be used to describe the event (cf. 27). However, despite their cognate status, the semantics of appear in English and aparecer in Spanish do not match. Again, the speaker uses a manner gesture when there is no verbally expressed counterpart. In (25), participant 1 (L1 English) does not use a manner gesture to describe the scene. It is worth noting the pre-positioning of the particle out vis-à-vis jumps, produced with a sudden increase in volume – which, in our view, imparts a sense of manner which could have been expressed with the adverb suddenly.

In (26), participant 10 (L2 Spanish) uses the verb asomarse ‘stick out’, which from the perspective of L1 Spanish is somewhat odd in this context. This is preceded by the phrase de pronto ‘suddenly’, which lexically conveys the manner of the motion. No gesture is produced. Hence, the utterance seems to reflect an L1 English TFS pattern without focus on manner (i.e. manner is verbally encoded but not marked in gesture). In (27), participant 7, an L1 Spanish speaker, fails to describe the manner of the frog’s motion either in speech or in gesture, opting instead to describe the state of affairs, a typical Spanish pattern.

We conclude our analysis of specific utterances with the one example that potentially signals a shift toward English TFS. This excerpt comes from participant 5 (L1 Spanish), who, it will be recalled, had been living with an English-speaking partner in the U.S. for three years at the time of the study. The utterance is intended as a description of the action in which the frog leapt from the woman’s salad plate onto her face:

28) and all of a sudden the frog [jumps off the table](path: right hand moves towards her face)

The speaker uses an appropriate manner verb (cf. 14) and marks path on the satellite. However, the difficulty here is that only part of the frog’s trajectory is verbally signaled. We are told that it jumps off the table but not that its trajectory is towards the woman’s face. This is indicated exclusively in the speaker’s gesture. The speaker does not say anything else about the movement of the frog towards the woman following this utterance and instead shifts her attention to the next event: the frog jumping into the glass of wine. A more appropriate English TFS pattern would have been something like the frog jumps from the table towards the woman’s face. In (28), then, while a path gesture coincides with a path satellite, the satellite nevertheless is incomplete. A complete TFS shift implies not just synchronizing the gesture with the satellite but also describing the entire trajectory of the figure.

Discussion and conclusions

The foregoing analysis shows that the L2 speakers participating in the study have not shifted their L1 TFS patterns toward the patterns used by speakers of the L2. In fact, some speakers had to create ways of dealing with typological differences between English and Spanish in order to maintain their L1 patterns of TFS. L1 Spanish speakers narrating in their L1 or L2 (English) showed a strong tendency to focus their path gestures on verbs or ground NPs. This same group of speakers to a large extent avoided verbs that conflate motion and manner in either language and used manner gestures in L2 English even when manner was not clearly encoded in the verb. The L1 English speakers concentrated their path gestures on satellites or ground NPs, whether relating the narrative in their L1 or their L2, and avoided synchronizing path gestures with path verbs, the preferred L1 Spanish pattern. The most interesting data emerged from the L2 Spanish narratives, where the frequencies revealed a fairly high percentage of manner gestures (compared to L1 English frequencies) occurring in the absence of clear manner verbs in speech, a pattern that at face value would indicate a shift in TFS to the L2. However, our analysis of specific utterances revealed that the pattern most likely resulted from the speakers’ inability to find a suitable manner verb in Spanish, and therefore we concluded that the pattern represented a compensatory strategy rather than a shift in TFS.

In general, the above findings align more closely with the findings in Kellerman and van Hoof (in press) than they do with those in Stam (2001). Although the findings of our study do not jibe with Stam’s, we are at this point not prepared to argue that L2 speakers are never able to shift their TFS orientation. On the contrary, the findings of all three studies taken together paint an interesting and complex picture of L2 TFS. While some individuals may move toward the L2 pattern, others, even at high levels of proficiency, for whatever reasons may not, and this in itself is the stuff of future research. Although Kellerman and van Hoof’s study did not directly address the matter of TFS, the performance of their L1 Spanish speakers nevertheless clearly reveals an L1 TFS preference in their L2 English performance. As we pointed out earlier, however, we have no way of drawing direct comparisons across the three studies because we are unsure of the comparability of the speakers’ proficiency levels in each study. However, in our view there is no reason to assume that verbal proficiency, however it is construed, bears a direct relationship to the TFS patterns promoted by that language. In other words, we want to avoid two inappropriate assumptions: that improvement in verbal proficiency will result in shifts in TFS; and, by the same token, and perhaps more importantly, that one cannot be a proficient user of a language unless and until one has modified one’s TFS. Our reasons for making this argument are grounded in the work of McNeill and his colleagues, as well as Slobin’s recent research (2003), and in the theoretical writings of Vygotsky.

According to McNeill (2002: 8), “to ask if gesticulations are well-formed is missing an essential point about them, which is that they are idiosyncratic and created at the moment of speaking”. In this sense, L2 speakers deploy gestures that have their own quality and are always “well-formed” in the context of communication. We do not see the L2 speakers’ struggle to describe the motion events of the frog story, signaled by such features as pauses, as evidence of lack of proficiency. Rather we believe that these pauses reflect the dialectic struggle of learners to reconcile their habitual thinking-for-speaking patterns with the L2. In this sense, all gestures are well formed at the moment of speaking. This also means, among other things, that even if speakers develop a high degree of verbal proficiency in an L2, it need not be accompanied by a high degree of cognitive proficiency (we expand on this important matter below). We hasten to add that we are not about to make an argument for the need to develop tests of L2 cognitive proficiency, as we do not want to further complicate matters for those individuals struggling to live their lives through another language by recommending yet another variable to be assessed. We see nothing wrong in maintaining an L1 TFS in an L2.

Slobin (1996a) notes that linguistic features closely connected to TFS tend to be more resistant to historical change than other features of a language. This stands to reason, because if the TFS properties changed frequently, this would have significant consequences for how speakers of the language in question perceived events in the world. In essence, it would mean that the speakers would be constantly changing their minds. Slobin (2003: 170) points out, for example, that English manner-of-motion verbs formed coherent semantic domains from the time of Old English, “with many new verbs being added ever since”. Over the course of its history, then, the English lexicon became “highly saturated” with regard to the semantic space of manner of motion (p. 163). Languages such as Spanish, while possessing an inventory of manner verbs, do not provide their speakers with the multitude of distinctions for conceptualizing manner of motion as does English. What this means is that in ontogenesis, children born into English-speaking communities will be immersed in talk that draws their attention to the fine-grained (Slobin’s term) aspects of motion events, not only through the verbal code but also through the gestures deployed by their interlocutors. In languages with low manner saturation, such as Spanish, children will certainly have their attention drawn to manner-of-motion events, but this is likely to be in a less graphic way than in the case of English. Thus, while a Spanish-speaking child might be told that un perro está corriendo por el jardín ‘a dog is running through the garden’, an English-speaking child might be told that a dog is scampering through the garden. Consequently, according to Slobin (2003: 164), English speakers are more likely to develop a “rich mental imagery of manner of motion”, and “manner of motion will be salient in memory of events and in verbal accounts of events”.

Slobin’s account coheres nicely with a sociocultural theoretic understanding of the human mind, where mind develops as children appropriate linguistically organized concepts made available to them by members of a community as they are drawn into the social practices of that community. Vygotsky (1978: 35) states that “the system of signs restructures the whole psychological process and enables the child to master his movement. It reconstructs the choice process on a totally new basis” [italics in original]. As Vygotsky stressed, this process is, by and large, non-reflective (see also Scollon 2002). Thus, with regard to motion events, for English speakers “manner is an inherent component of a directed motion event along a path” [italics added], but for Spanish speakers it is “much less salient and attention is focused on changes of location and the settings in which motion occurs” (Slobin 2003: 175). Hence, English speakers are inclined to conceive of manner and motion along a path as “a single conceptual event” [italics in original], while Spanish speakers conceive of these same events as “activities that take place in specified geographic regions” (ibid.). This distinction is nicely illustrated in our analysis of examples (14) and (15) above, where the L1 English speakers used a single manner verb followed by a series of path satellites that traced the frog’s trajectory from the child’s pocket through the restaurant and ultimately into the saxophone, as if it were a single motion, whereas the L2 English speaker, in describing the same event, omitted the middle portion of the trajectory. Furthermore, as we pointed out, capturing the same elements of the event as presented by the L1 English speaker in Spanish requires three different finite clauses, which in essence breaks the frog’s trajectory into separate geographic regions.

If, as Vygotsky proposes, human mental activity is organized in accordance with concepts appropriated as a consequence of participation in (linguistically) mediated social practices, then, as Slobin (2003: 179) suggests, TFS may not merely influence how people talk about events but, more importantly, how they experience those events “they are likely to talk about later” (ibid.). Slobin calls this the “anticipatory effects” of language, which arise during experience time, when “prelinguistic or nonlinguistic coding” takes place as the person attends “to those event dimensions that are relevant for linguistic coding”, and at speaking time, when TFS comes into play as the person attends to and accesses “the linguistically codable dimensions” of the event (ibid.). While we find this way of conceptualizing the interaction between thinking and speaking attractive and entirely commensurate with Vygotsky’s theorizing on the thought–speech dialectic, the parameters of what it means for a person to be likely to talk about something later are not very well specified. In research situations, which are the focus of Slobin’s proposal, participants are normally expected to talk about events either online as they experience them or shortly thereafter. Thus, the testing time, when the consequential effects of TFS are assessed, is fairly well circumscribed.

We have assumed, however, that anticipatory effects also hold for non-research circumstances. In such cases, what does it mean to say that a person is likely to talk about a particular experienced event later? We propose that a reasonable answer to this important question is found in Vygotsky’s (1987) notion of inner speech; that is, the psychological derivative of social speech through which the culturally constructed concepts of a community are internalized during ontogenesis of the individual. According to Vygotsky, as inner speech develops from social speech, the formal properties of the language are jettisoned until only meanings are left. These meanings mediate people’s mental activity as they go about the business of living in the concrete world of events, even though people are generally unaware of the mediating effects of inner speech (see Sokolov 1972). From this perspective, then, an individual’s experience of an event is mediated by inner speech regardless of whether they anticipate speaking about the event in the future or not. If we recognize that inner speech is the psychological descendant of social speech, then experiencing an event is very much a linguistic activity, but one in which the speaker and the addressee are situated within the same individual (see Vocate 1994).

Inner speech is closely aligned with Bourdieu’s broader notion of habitus, which is understood as a set of bodily dispositions arising from the person’s accumulated experience of social activities and which, like inner speech, “generate practice, perceptions and attitudes which are ‘regular’ without being consciously co-ordinated or governed by any ‘rule’” (Editor’s introduction, Bourdieu 1991: 12). The habitus, as the bodily counterpart of inner speech, is clearly connected to gestures, described by Vygotsky as “the material carriers of thinking” (McNeill and Duncan 2000: 155). Hence, as McNeill suggested in his discussion of growth point, inner speech and habitus form the dialectical unity of mind and body.

Of the various features of the habitus, its durability is most relevant to our present concern. Given that the habitus, again parallel to inner speech, develops through a gradual inculcation process “in which early childhood experiences are particularly important”, its dispositions are “ingrained in the body in such a way that they endure through the life history of the individual, operating in a way that is pre-conscious and hence not readily amenable to conscious reflection and modification” (Bourdieu 1991: 13). Finally, the habitus, like inner speech, orients individuals toward certain perceptions, attitudes, actions and inclinations, but it does not strictly determine these.

Thus, from the perspective of inner speech/habitus, as the participants in our study scrutinized the events in the frog story, their perception of the motion depicted in the story was oriented by their internalized conceptual meanings that had been inculcated during the early years of their ontogenetic formation as they were drawn into the cultural practices of their respective speech communities. For the L1 English speakers, the manner in which the motion events unfolded was consequently brought to the fore, while for the L1 Spanish speakers, path of motion and physical setting were perceived as salient as the events unfolded. In other words, as the participants were shown the sequence of transparencies, they were simultaneously making sense of (i.e. experiencing) them as a coherent narrative through the mediational means of their inner speech and habitus,11 which, in our view, means they were engaged in language-based activity only in the psychological (intrapersonal) rather than the social (interpersonal) domain. It might seem that there is no real difference between our position and Slobin’s; however, we think there is, and we believe it will show up in a task which asks participants to draw the events of a visually presented story. If the anticipatory effects hypothesis is correct, we would expect greater similarity between the drawings of L1 Spanish speakers and L1 English speakers, since anticipating how they would talk about the events later should not be an issue (unless “later” is more vague than Slobin seems to assume, as we suggested). If, on the other hand, our hypothesis is correct, then we would expect the drawings from each group to highlight the same differences in motion events as emerged in the verbal narratives.

Given the durable nature of inner speech and habitus, we would expect shifts to L2 TFS to be rare, though not impossible, occurrences. We would, for instance, expect that those undergoing immersion experiences, as usually (though not always) happens in the case of immigrants to a new culture, would be under more pressure to adapt to their new circumstances and thus be more likely to reorganize their inner speech/habitus than would classroom foreign language learners (see McCafferty and Ahmed 2000; Pavlenko and Lantolf 2000). On the other hand, we do not want to claim that immersion necessarily means the individual will succumb to the communicative pressures of the new culture. There are documented cases in the literature of people resisting intrusion of new cultural patterns on their social and mental activity (Belz ms.; Siegal 1996; deBot and Hulsen 2000).

Finally, in light of the performance of our participants, we believe it is necessary to reconsider the way in which manner verbs themselves have been categorized by linguistic as well as psycholinguistic researchers. That is, we found no instances in which L1 English speakers marked manner through gesture with verbs such as jump or fall in English. With these verbs they synchronized their gestures on path satellites and grounds, as in example (14), thus signaling that manner was not in focus. From a purely linguistic perspective, it can be assumed (as Talmy 2000 and Slobin 2003 indeed do, based on their respective lists of manner verbs) that verbs such as jump and fall conflate motion and manner; from the perspective of speaker performance as revealed through gesture behavior, however, the speakers themselves may not feel that such manner verbs are sufficiently descriptive to capture the nuances of manner events that as English speakers they have been habituated into expecting.



Thus, we think a distinction needs to be made between ‘basic manner verbs’ such as jump, fall, run, throw, walk, climb, all of which have frequently occurring equivalents in V-languages such as Spanish (saltar, caerse, correr, tirar, caminar, subir), and more nuanced verbs such as stomp, romp, trudge, stagger, swagger, sweep, etc., which do not. In other words, based on the gesture patterns of our L1 English speakers and those in Kellerman and van Hoof’s study, speakers may no longer, if they ever did, conceptualize basic manner verbs as conflating motion and manner, despite what linguistic analysis may assume. We realize that the number of speakers participating in both gesture studies is small and that therefore, at this point, our proposal is speculative. However, we think that it may be an area of fruitful future research where the gesture–speech interface might be able to bring to light important information on the way a speaker’s lexicon is organized.

Notes

1. We would like to thank Gale Stam, Susan Duncan, and David McNeill for their comments on earlier versions of our manuscript and for their willingness to discuss our research and offer suggestions. We express a special thanks to David McNeill for making his lab available to us for carrying out the fine-grained analysis of our data.

2. Others refer to conventionalized gestures as ‘symbolic gestures’, ‘autonomous gestures’, or ‘semiotic gestures’ (see Krauss, Chen and Gottesman 2000: 262).

3. Krauss, Chen and Gottesman (2000: 263) refer to gesticulations as ‘lexical gestures’, while others have labeled them ‘representational gestures’, ‘ideational gestures’ or ‘illustrators’.

4. Transcription conventions for gestures:
   [ ] encloses the stroke phase, or peak effort, in the gesture. The stroke phase is obligatory as it expresses the meaning and focus of the gesture.
   bold letters indicate the precise point where the stroke of the gesture synchronizes with speech.
   indicates the optional post-stroke hold phase of a gesture. It may be briefly held until the hand moves back to a resting position.
   / indicates a silent pause, with multiple slashes (//) indicating extended pauses.

5. Example (10b) is very interesting, given that according to Talmy (2000: 28) it is a self-agentive motion verb with conflated manner, and therefore it should be possible to produce (10b) with manner co-occurring in a swinging gesture. In our view, this is not permissible in English. Things would be different, however, if the sentence were intended to indicate not a swinging motion but an up-and-down movement. In this case the expected gesture would be a repeated up-and-down movement of the hand or arm. Thus, it seems that gesture adds an additional dimension to Talmy’s taxonomy of motion verbs in which some verbs, at least, may conflate with manner for some movements but not for others. This is parallel to the case of path conflation for the verb climb, as already discussed. Spanish-speaking learners of English should have a particularly difficult time encoding the appropriate meaning in such cases.



6. The average speaker, however, is aware of emblem gestures and in fact often believes these are the only types of gestures people deploy.

7. For research on the appropriation of L2 gestures that does not focus especially on the speech–gesture interface, consult the work of McCafferty (1998, 2002) and McCafferty and Ahmed (2000).

8. Some, including an anonymous reviewer of an earlier version of this article, may question whether grab is indeed a manner verb. According to Talmy’s (2000) new analysis of motion verbs, it seems clear that grab falls under his Agentive Move + Manner category, as it is quite similar to verbs such as twist and pop.

9. The first task was very productive in eliciting a different type of gesture: item gestures which are related to specific objects depicted in each frame. This type of gesticulation was considerably more frequent in L2 speakers, where some of these gestures seem to emerge even before their lexical counterpart. One of the reasons for the abundance of this type of gesticulation may be the original design of the first part of the task (McNeill, personal communication to Negueruela). Since participants did not know what was coming in the next scene or what was germane to the story, they were forced to describe each scene in detail. This seems to be a key issue, and it supports McNeill’s growth point hypothesis in relation to the connection between gesture and the emergence of meaning, since in the co-construction task participants had to create and decide “on-line” on the important meanings for the story. This phenomenon is in itself of great interest and complexity and will be explored in detail in a forthcoming publication.

10. Here we also see an instance of a gesture that could well serve a (failed) lexical search function, as suggested by Butterworth and colleagues (see McCafferty’s article in this issue).

11. Interestingly, as Appel and Lantolf (1994) show for both L1 and L2 speakers of a language, sometimes a task is sufficiently complex (e.g. recalling a high-information expository text) that individuals cannot make full sense of it until they externalize their inner speech as overt private speech; that is, they communicate with themselves with speech that has features of social speech but which continues to serve a psychological function.

References

Appel, G. and J.P. Lantolf (1994) Speaking as mediation: a study of L1 and L2 text recall tasks. The Modern Language Journal 78: 437–52.

Belz, J.A. (unpublished manuscript) Language learning in immigration: the literary testimonies of Werner Lansburgh.

Bourdieu, P. (1991) Language and symbolic power. (Edited and introduced by J.B. Thompson.) Cambridge, MA: Harvard University Press.

deBot, K. and M. Hulsen (2000) Language attrition: tests, self-assessments and perceptions. Paper presented at the Conference on Bilingualism, Bristol University, April.

Duncan, S. (unpublished manuscript) Co-expressivity of speech and gesture: manner of motion in Spanish, English, and Chinese. University of Chicago.

Dushay, R.A. (1991) The association of gestures with speech: a reassessment. Ph.D. dissertation, Columbia University, New York.



Frawley, W. and J.P. Lantolf (1985) Second language discourse: a Vygotskyan perspective. Applied Linguistics 6: 19–44.

Goldman-Eisler, F. (1968) Psycholinguistics: experiments in spontaneous speech. London: Academic Press.

Harris, R. (2002) The role of the language myth in the Western cultural tradition. In R. Harris, The language myth in Western culture. Richmond, Surrey: Curzon. 1–24.

Hopper, P. (1998) Emergent grammar. In M. Tomasello, The new psychology of language. Cognitive and functional approaches to language structure. Mahwah, NJ: Erlbaum. 155–76.

Jiménez-Jiménez, A. and B. Centeno-Cortés (2002) Problem-solving tasks in a foreign language: the importance of the L1 in private verbal thinking. Paper presented at the Annual Conference of the American Association for Applied Linguistics, Salt Lake City, April.

Kellerman, E. and A-M. van Hoof (in press) Manual accents. IRAL 41.

Krauss, R., Y. Chen and R.F. Gottesman (2000) Lexical gestures and lexical access: a process model. In D. McNeill, Language and gesture. Cambridge University Press. 261–83.

Leontiev, A.A. (1981) Psychology and the language learning process. Oxford: Pergamon Press.

Luria, A.R. (1973) The working brain. New York: Basic Books.

Mayer, M. (1969) Frog, where are you? New York: DIAL.

Mayer, M. (1979) Frog goes to dinner. New York: DIAL.

McCafferty, S.G. (1998) Nonverbal expression and L2 private speech. Applied Linguistics 19: 73–96.

McCafferty, S.G. and M.K. Ahmed (2000) The appropriation of gestures of the abstract by L2 learners. In J.P. Lantolf, Sociocultural theory and second language learning. Oxford University Press. 199–218.

McNeill, D. (1992) Hand and mind. What gestures reveal about thought. University of Chicago Press.

McNeill, D. (2000a) Introduction. In D. McNeill, Language and gesture. Cambridge University Press. 1–12.

McNeill, D. (ed.) (2000b) Language and gesture. Cambridge University Press.

McNeill, D. (2002) Gesture and language dialectic. International Journal of Linguistics Acta Linguistica Hafniensia 34: 7–37.

McNeill, D. and S. Duncan (2000) Growth points in thinking for speaking. In D. McNeill, Language and gesture. New York: Cambridge University Press. 141–61.

Pavlenko, A. and J.P. Lantolf (2000) Second language learning as participation and the (re)construction of selves. In J.P. Lantolf, Sociocultural theory and second language learning. Oxford University Press. 155–78.

Reddy, M.J. (1993) The conduit metaphor: a case of frame conflict in our language about language. In A. Ortony, Metaphor and thought (2nd edition). Cambridge University Press. 164–201.

Scollon, R. (2002) Mediated discourse. The nexus of practice. London: Routledge.

Siegal, M. (1996) The role of learner subjectivity in second language sociolinguistic competency: Western women learning Japanese. Applied Linguistics 17: 356–82.

Slobin, D. (1996a) From ‘thought and language’ to ‘thinking for speaking’. In S. Gumperz and S. Levinson, Rethinking linguistic relativity. Cambridge University Press. 70–96.



Slobin, D. (1996b) Two ways to travel: verbs of motion in English and Spanish. In M. Shibatani and S. Thompson, Grammatical constructions: their form and meaning. Oxford: Clarendon Press. 195–219.

Slobin, D. (2003) Language and thought online: cognitive consequences of linguistic relativity. In D. Gentner and S. Goldin-Meadow, Language in mind: advances in the study of language and thought. Cambridge, MA: MIT Press. 157–92.

Sokolov, I. (1972) Inner speech and thought. New York: Plenum.

Stam, G. (2001) Gesture and second language acquisition. Paper presented at TESOL Convention, St. Louis, Missouri, March.

Talmy, L. (1985) Lexicalization patterns: semantic structure in lexical forms. In T. Shopen, Language typology and syntactic descriptions. Vol. 3: Grammatical categories and the lexicon. Cambridge University Press. 57–149.

Talmy, L. (2000) Toward a cognitive semantics. Vol. II: Typology and process in concept structuring. Cambridge, MA: MIT Press.

Valsiner, J. (2001) Process structure of semiotic mediation in human development. Human Development 44: 84–97.

Vocate, D.R. (1994) Self-talk and inner speech: understanding the uniquely human aspects of intrapersonal communication. In D.R. Vocate, Intrapersonal communication. Different voices, different minds. Hillsdale, NJ: Erlbaum. 3–32.

Vygotsky, L.S. (1978) Mind in society. The development of higher psychological processes. Harvard University Press.

Vygotsky, L.S. (1987) The collected works of L.S. Vygotsky. Vol. 1: Problems of general psychology. (Including the volume Thinking and Speech). New York: Plenum Press.

[Received 19/12/02; revised 16/9/03]

Correspondence to:
James P. Lantolf
Center for Language Acquisition
305 Sparks Building
The Pennsylvania State University
University Park, PA
[email protected]

APPENDIX

Below is a basic description of the events depicted in the transparencies used in our study. These pictures were taken from Mercer Mayer’s (1979) Frog Goes to Dinner.

Frame 1: A young boy stands at a dresser, fixing his tie in the mirror. Behind him stand a dog, a frog and a turtle.

Frame 2: Split frame: In one half, the boy sits in a chair, petting the dog; his frog has crawled into the pocket of his jacket. In the other half, the boy is leaving with his family while the dog and turtle stay behind.

Frame 3: The boy, his sister, mother and father stand at the valet station of a “Fancy Restaurant”.

Frame 4: The family sits at a table with menus; a waiter speaks to the table. The frog jumps towards a small band set up in the restaurant.



Frame 5: Split frame: In one half, the frog’s two legs are seen sticking out of the bell of the saxophone. In the other half, the saxophone player shakes the instrument upside down over his head.

Frame 6: The frog falls onto the saxophone player’s face and he falls back into a drum. The other band members become angry/amused.

Frame 7: The band players argue while the frog jumps into a salad carried by a waiter.

Frame 8: The salad is placed in front of an older woman, and she begins to eat it.

Frame 9: The frog is discovered and jumps off the plate and towards the woman’s face as the woman falls back in her chair. The frog jumps into a wine glass being held by a patron at the next table. The patron and his female friend do not appear to notice the frog.

Frame 10: The older woman complains to the waiter. The patron attempts to drink from his glass, whereupon the frog kisses him on the nose.

Frame 11: The frog sits, waving, on another table as the patron and his female friend help one another out of the restaurant, looking horrified. The waiter approaches the frog as if to catch it.

Frame 12: The waiter carries the frog upside down to the exit and the young boy/frog owner notices that the waiter has his frog.

Frame 13: Split scene: In one half, the boy approaches the waiter and pleads with him. The boy’s family stands behind him. In the other half, the family is getting thrown out of the restaurant.

Frame 14: The family is in the car. The parents and sister look angry but the young boy looks happy to have his frog back.

Frame 15: The family arrives home and the boy is sent to his room as the dog and turtle watch.

Frame 16: In the boy’s bedroom, the boy and frog laugh with the dog and turtle.
