Linguistics and Education 17 (2006) 40–55

Differing expectations in the assessment of the speaking skills of ESOL learners

James Simpson ∗
University of Leeds, School of Education, Leeds LS2 9JT, UK

Abstract

This is a study of the assessment of the speaking skills of adult learners of English for speakers of other languages (ESOL). It is prompted by a concern that participants can have differing expectations of what nature of speech event a speaking test actually is. This concern was identified during the administration and analysis of assessments carried out as part of a study into adult ESOL pedagogy in the UK. The paper employs the notions of knowledge schema and frame in discourse to draw together areas of interest in testing: whether the speaking assessment should be viewed as an interview or as a conversation; divergent interpretations of the test event by learners; and variation in interlocutor behaviour. Implications for testing the speaking skills of adult ESOL learners are discussed; these are pertinent at a time of large-scale high-stakes testing of learners who are migrants to English-speaking countries.
© 2006 Elsevier Inc. All rights reserved.

Keywords: ESOL; Speaking; Testing; Assessment; Schema; Frame

1. Introduction

This paper is an investigation of the assessment of the speaking skills of adult learners of English for speakers of other languages (ESOL). It is based on an analysis of recordings of tests collected as part of the NRDC ESOL Effective Practice Project (EEPP), a major study of adult ESOL pedagogy in the UK, carried out between 2003 and 2006. This paper is informed by conversation analysis (Sacks, Schegloff, & Jefferson, 1974), by work in the tradition of the ethnography of speaking (Hymes, 1962), and by the notion of frame in discourse (Goffman, 1974; Tannen, 1993; Tannen & Wallat, 1993), and treats the oral interview speech event as a jointly constructed accomplishment.

∗ Tel.: +44 113 343 4687; fax: +44 113 343 4641. E-mail address: [email protected].

0898-5898/$ – see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.linged.2006.08.007


It draws together related areas of concern for language testers by addressing questions which were raised during the administration of the EEPP assessment: What sort of speech event is a speaking assessment actually trying to be or trying to mirror? What construct is it testing? How can one account for learners' and testers' widely differing linguistic behaviours during the assessment event? These questions are taken up with reference to learners of English who are migrants to English-speaking countries, and often refugees and asylum seekers. Such learners need to engage with English language tests for a variety of purposes. For example, adult migrants arriving in the UK with little or no English and enrolling on a course of ESOL study will undergo a series of externally accredited language assessments from the very first level, and later progression depends on success in such tests. Learners who seek employment are also obliged by many employers to demonstrate through language qualifications that they have competence in English. And success in an English test is also used to satisfy the language requirement for citizenship.

Many adult ESOL learners have low levels of educational attainment, and little experience of school. It follows that they also have little previous experience of formal testing situations of the type which they may have to undergo in their current learning environments in the UK. The genesis of the current paper lies in my experience as a participant in a speaking test for such learners. This test is a paired format oral communication test administered to learners who were participants on the EEPP to generate a 'before and after' measurement of progress. I describe the test more fully in Section 4 below. While I was carrying out the assessments, certain themes recurrently emerged concerning the test-taking experience of the learners on the project, particularly those who had little or no schooling. These concerned the discourse processes at play when these learners underwent a formal speaking assessment, and whether there were identifiable factors which hindered such test-takers' performance. Subsequent investigation of the recordings and transcripts of the tests revealed a varying ability amongst learners to deal with the tasks in the different parts of the test; more tellingly, it revealed patterns of differing expectations of the test event itself, among both test-takers and their interlocutors. This paper is devoted to an examination of these differing expectations. The analysis also extends to the question of how such expectations bear upon the learners' ability to achieve a performance in the test from which inferences can be drawn which honestly reflect their level of competence in spoken English.

The paper is organised into six further sections. The next section covers the theoretical framework on which I draw to inform the analysis: the notions of structure of expectation, schema, and frame in discourse. Section 3 engages with the debate about what type of speech event a speaking test is modelled upon, an interview or a conversation. Sections 4 and 5 describe first the EEPP test and then the participants in the test. Section 6 contains analyses and discussion of extracts of the test, and the concluding section draws implications for testing the speaking skills of low-level adult ESOL learners.

2. Structures of expectation

The term structure of expectation refers both to the background knowledge which people bring to a speech event such as a speaking test and to how, according to the background knowledge they possess, they position themselves linguistically during the unfolding of the event. Structures of expectation, in Tannen's words (1993:21), '. . . make it possible to perceive and interpret objects and events in the world . . . [and to] shape those perceptions to the model of the world provided by them'. The principal structures of expectation which are relevant to this paper are knowledge schemas and interactive frames.


The first of these are relatively stable psychological structures of background knowledge which participants bring to a speech event; the second is the essentially dynamic, instant-to-instant linguistic positioning of participants in the interaction; these frames are themselves informed by the knowledge schemas participants associate with that interaction.

The most commonly used term for the storage and utilisation of background knowledge is schema (Bartlett, 1932). A schema is the mental representation of a typical instance which, as Cook (1994:11) explains, is 'used in discourse processing to predict and make sense of the particular instance which the discourse describes'. Schematic knowledge as defined by Widdowson is 'common knowledge of shared experience and conventionally sanctioned reality' (1990:102), and is invoked when it is necessary to refer to '. . . more general and conventional assumptions and beliefs which define what is accepted as normal or typical in respect of the way reality is structured and to the conduct of social life' (ibid.). Schemas are not essentially static notions; they are shaped by the ongoing experience of interaction in particular areas. Moreover, without experience of an instance of a particular speech event—such as a spoken language test—it cannot be supposed that an individual possesses a well-developed schema for that event.

The term frame is used here in the sense developed by Tannen and Wallat (1993) as interactive frame. This refers to the participants' sense of what is going on at the moment, and is similar to speech activity in Gumperz' (1977) use of the term, for example, chatting, explaining, interviewing or testing. According to Tannen and Wallat it also reflects Goffman's (1981) notion of footing: 'the alignment participants take up to themselves and others in the situation' (1993:73). The inter-relationship between knowledge schemas and interactive frames is central to the analysis undertaken by Tannen and Wallat, who claim that 'a mismatch in schemas triggers a shifting of frames'. In the current analysis, I am concerned with how surface linguistic features can reveal a variance of expectations of the speech event called a speaking test. The knowledge schemas which both the learners and the interlocutor bring into the event, i.e. which constitute their expectations, will affect the alignment they take up, i.e. the frames which are brought about as the interaction proceeds.

In the case of the present study, an a priori assumption is made that the learners involved, because of their lack of educational background, do not have well-developed knowledge schemas for the speaking test event, and thus have a limited understanding of its aim of producing a large and extensive enough sample of language by which to assess the learner. This is partly due to the more general difficulty of being clear about what a speaking test is designed to test (see Section 3 below). Interviews with learners themselves reveal a lack of understanding of the purpose of the test. In this quotation taken from an interview for the ESOL Effective Practice Project (see Cooke, this volume), a learner in an Entry Level 1 (i.e. beginners) ESOL class in the North of England mentions the speaking test she has recently undergone:

When I did the interview with Mr John and Mr James they asked, 'Do you like English?' I said yes. They asked why. A strange question. You need it when you go out.

(Translated from Arabic; words originally spoken in English are underlined)

The learner uses the word interview in preference to test, and reports that she takes the intention of the question why? at face value ('A strange question'). It can thus already be seen that ESOL learners with little educational background may be approaching the speaking test event without knowledge schemas founded upon prior experience of language assessment of this kind, and do not filter the discourse through the lens of such experience. The matter of differing responses to pedagogical events is one addressed by the psychologist Luria (Vocate, 1987).


Schooled respondents may be able to recognise and perform within the studied artificiality of the pedagogical or (in this case) the testing event; unschooled respondents may take matters more at face value, unable through lack of experience to fully understand the pedagogical or testing aspect of the interaction.

3. Speaking tests as conversations or interviews

A fundamental question for language testers involves identifying exactly what a particular test is testing: what is the construct? As it relates to speaking tests for low-level learners this question can be re-formed in a number of ways. Here I pose two related questions: (1) If an assessment is testing conversation, how conversation-like is it? And (2) if it is not testing conversation, then what is it testing?

It is often assumed that in a formal assessment of speaking, the construct being tested is conversation, viewed by many as the prototypical and basic form of oral interaction. Yet an examination of conversation, and a comparison with speaking test discourse, suggests that they are quite different in nature. Spoken casual conversation is defined by Eggins and Slade (1997:19) as talk which, in contrast to other speech events such as interviews and service encounters, is not motivated by any clear pragmatic purpose. Within this definition, Eggins and Slade posit certain differences between casual conversation and pragmatically oriented interaction: in terms of number of participants (often there are more than two people in a conversation); whether or not a pragmatic goal is achieved (this is not the aim of casual conversation); length (pragmatic interactions tend to be short); and level of politeness and formality (casual conversation often displays informality and humour). By these criteria at least, formal speaking assessments such as the one discussed here are clearly not conversations. But how conversation-like are they? This is a question which has exercised testers of spoken language since the publication of van Lier's classic paper (1989) in which he questioned the extent to which an oral proficiency interview (OPI) was actually an example of conversational language use (van Lier, 1989:495). van Lier's analysis of language test data claimed to demonstrate that it was not conversation-like; rather it exhibited many of the features of formal interviews, for example asymmetry and interviewer control.

Such a position tends to set the interview in dichotomous opposition to conversations, underplaying the fact that question and answer formats exist in conversation as well as in interviews. Yet the predominantly asymmetrical nature of exchanges within the speaking test event is undeniable; studies following van Lier's work by Lazaraton (1992), Riggenbach (1991), and Ross and Berwick (1992) all demonstrate that the majority of interaction in OPIs consists of questions being asked by the interlocutor and answers being provided by the student. Johnson and Tyler's (1998) analysis of a model OPI used for training shows it lacks the patterns of turn-taking and topic initiation characteristic of naturally occurring conversation, and lacks conversational involvement and typical shows of responsiveness. These authors conclude that 'the face-to-face exchange that occurs in the OPI interview cannot be considered a valid example of a typical, real life conversation' (1998:28).

The OPI is only one kind of speaking test. In response to concerns over the asymmetrical nature of participation in the OPI, testing organisations have developed other test formats which attempt to elicit a range of responses beyond the dyadic question and answer of the OPI. These include the paired format test developed by Cambridge ESOL (Taylor, 2003) and adapted for the EEPP, as discussed in Section 4 below. At higher levels in particular, this allows for an extended long turn and peer discussion. Even so, in their review of research methods in language testing, Lumley and Brown summarise the general findings of discourse analysis approaches to the question of speaking assessment (2005:842):


A general consensus has emerged . . . that although oral interaction shares some features with nontest conversation, it is essentially a distinct and institutional form of interaction, and that because of its nonsymmetrical nature, it is limited in the extent to which it can provide an indication of nontest conversational performance.

There remains the question posed earlier: if the speaking assessment is not testing conversational performance, then what is it testing? In their asymmetry, power discrepancy, and inbuilt pragmatic intention, many if not most formal speaking assessments, including the EEPP test, correspond to a lay definition of interviews, and can be viewed as interview-like events. This would not invalidate the test as a measure of communicative competence, argue Moder and Halleck (1998:118): '. . . informal conversation is only one type of speech event in which communication takes place, and is probably no more representative of the events in which non-native speakers participate than other speech events'. Testing the ability to interact in an interview situation is not an unrealistic or irrelevant aim: in the course of their day-to-day lives, ESOL learners such as the participants in this study face institutional interactions where similar power relations exist, and being able to interact appropriately in such interactions is a vital skill. Yet despite the argument of Moder and Halleck, what speaking assessments cannot generally claim is to be able to test the entire complex construct of spoken communicative competence.

Behind the naive idea that the intention of a speaking assessment is to test conversational skills lies an often unstated assumption that the fundamental aim of the test is to represent a student's communicative competence. This, says Widdowson (2003), reduces the notion of communicative competence to the ability of a speaker to hold a conversation, whereas speaking of course involves much more than that. The same could be said for a view in which the speaking test is testing the ability to perform in an interview as an emblem of communicative competence. Widdowson (2003:169) argues that the problem of testing communicative competence per se lies with the definition and operationalisation of communicative competence itself. He examines Hymes' (1972), Canale and Swain's (1980) and Bachman's (1990) accounts of communicative competence, supposing that 'as one deconstructs these models, one begins to have doubts as to whether any model of communicative competence can be made pedagogically operational as a framework for language testing'. He goes further, later suggesting that '. . . communicative tests are impossible in principle, which is why it is not surprising that they have proved so difficult to design, that you just cannot test the ability to communicate and so it is pointless to try' (2003:171). In this view, whatever limited range of speech activities the test event reproduces, simulates or represents, it can only hope to test a very limited part of the fragmented idea of communicative competence.

The above discussion suggests that speaking assessments for low-level learners can, indeed should, be analysed as interviews. This view is supported by Moder and Halleck, who contend that the use of questions 'clearly reflects one of the elements of the interview frame: it is the role of the interviewer to ask questions and the role of the interviewee to answer them adequately and appropriately' (1998:121). Furthermore, in terms of procedure the speaking test may be more interview-like than conversation-like. One participant in the test knows what is going to happen in advance—what questions will be asked, and so on; the other does not. But this generalisation neglects to address the difficulty raised by the use of the term interview by the learner who was quoted at the end of Section 2. That is to say, the pragmatic intention of a speaking test may be very different to that of another type of interview, for example with a prospective employer, a housing officer or an immigration official. This question requires an engagement with the deeper debate concerning authenticity and artifice in communicative language teaching (see Widdowson, 1978; van Lier, 1996; Cook, 2000).


In the case of language testing, activities which might be characterised as 'unreal' by one participant, the tester, may well be seen as 'real', i.e. not part of an activity designed to elicit a sample of language for the purpose of assessment, by the person being tested.

In this section I have discussed the nature of the speaking test as conversation-like or interview-like from something of an 'outside' perspective. The vital question for the remainder of the paper is how the participants in the interaction view the speaking test event: not so much in terms of whether it is a conversation or an interview, but rather, whether it is in fact a test, and recognisable through prior experience of such events, or whether it belongs to some other category of interactional event. What knowledge schemas are brought in, and what frames are brought about, when the very nature of the test as a test is in question? I return to these matters below, after situating the data by describing the test itself and the participants.

4. The test

The assessment under consideration was administered before and after an observational cycle of between three and nine months to all the students in each class involved in the NRDC ESOL Effective Practice Project (EEPP), in an attempt to gauge progress in learners' levels of speaking ability. The test was adapted from the speaking paper of the Key English Test (KET), the lowest in level of the examinations in the main suite of the testing organisation Cambridge ESOL, which is plotted against the Common European Framework of Reference for Languages (CEFR) at A2 or Waystage level (van Eck & Alexander, 1977). It was developed in partnership with Cambridge ESOL; this process is described in Simpson (2005). It is a paired format test, that is, the students are tested in pairs by two testers: an interactant known in testing circles as the interlocutor, and an assessor. Paired format testing comprises part of the 'Cambridge approach' to testing (Taylor, 2003); the pairing of candidates, in the Cambridge view, helps to make the interaction more authentic, thus maintaining validity, by allowing for interaction to take place between the candidates as well as between candidate and interlocutor. As discussed in Section 3 above, paired format testing was developed in response to contentions that the discourse of traditional interview-format speaking assessments is asymmetrical in comparison to prototypical conversational discourse. It has received some recent attention in the literature on spoken language testing (e.g. Norton, 2005; Galaczi, 2003). The test under discussion here has two versions: one for lower level and one for higher-level learners. The lower level version comprises an extended Part 1 of the KET, and was administered to the learners on the project at the very lowest levels. The higher-level test is very close in format to the KET itself, and uses the same marking scheme. Discussion and analysis in this paper is restricted to administrations of the higher-level test. This test, like the KET, is in two parts: in part one (5–6 min) each learner interacts with the interlocutor, 'giving factual information of a personal kind, for example name, place of origin, occupation, family, etc.' (University of Cambridge ESOL Examinations KET handbook, online). In the second part the two, or sometimes three, test-takers are supposed to interact with each other: 'Prompt cards are used to stimulate questions and answers which will be related to daily life, leisure activities and social life' (KET handbook, online). The test was adapted to take into account the backgrounds of the learners involved. For example, the question Why did you come to the UK?, which appears in the KET interlocutor script, was omitted: ESOL learners who are also refugees or asylum seekers have often undergone other interviews where the illocutionary force of such questions may well be to get information, but the stance of the interviewer is far from neutral. An individual's continued freedom or presence in the UK could depend on the answer.


Prompts in the second part of the test were based on the personal lives of candidates, and were taken from retired forms of the KET. This second part still involves a question and answer format, but between the two test-takers rather than between test-taker and interlocutor.

The test was administered with relatively little preparation on the part of the learners concerned: they were not trained in test-taking techniques beyond a short briefing and practice. The interviewers themselves, although having undergone training and standardisation, were not language testing professionals. Thus neither the learners concerned nor their interlocutors were bringing to the testing event a very well established knowledge schema of the communicative event, at least initially. The question, therefore, is how the flexibility implied by an under-developed knowledge schema plays itself out in the interactive frame development of the event itself.

The issue of training and standardisation of interlocutors and assessors in the testing process is one which exercises testing organisations greatly. At the heart of the matter is the familiar tension between reliability and validity. The Cambridge approach to testing upon which this test was founded stresses the importance of reliability; this is unsurprising given the scale and global scope of the Cambridge exams. It also echoes the general importance of reliability in any test that is to be used in a high-stakes decision-making process such as immigration. Regarding the Cambridge view, Taylor (2003:2) states:

It is principles of good measurement which determine . . . key features such as:

• the pairing of examiners (with one acting as participant-interlocutor and one as observer-assessor—but both providing an assessment of performance, i.e. multiple observations);

• the use of an interlocutor frame (to guide the management of the test and to ensure that all candidates receive similar, standardised input in terms of test format and timing);

• the implementation of a comprehensive oral examiner training/standardisation programme (to increase the reliability of subjectively judged ratings and provide a common standard and meaning for such judgments . . .).

Ross, however, reminds us that reliability might come at a price (1998:335):

Standardization of the interview and the rating criteria process may increase the reliability of the procedure at the expense of the generalizability of the ratings – a problem that potentially threatens the content validity of the procedure and the predictive validity of the level awarded to the candidate. The effect of standardization of interview protocols may portend serious consequences in other forms of diagnostic interviews as well, and may in fact influence the interaction and discourse in ways that are detrimental to the candidate.

Because interlocutors are human and fallible, some variability in interlocutor behaviour will occur regardless of the level of training and standardisation of behaviour; hence the tension between validity and reliability is ultimately irresolvable. In the case of the test under discussion, and despite the fact that the project team was trained and their marking was standardised, there remained an amount of variability in interlocutor behaviour. There are differing views as to whether this should be viewed as a problem or not, as suggested by the differing positions of Taylor (2003) and Ross (1998). Lazaraton, in her research into the Cambridge speaking tests, maintains that whatever the pros and cons of interlocutor variability, it does have an effect. She concludes a section on the analysis of interlocutor behaviour in a number of Cambridge ESOL examinations, including KET (2002:151–152):


Variability in behaviour is frequent . . . Using an interlocutor frame, monitoring interlocutor behaviour, and training examiners thoroughly are all ways to reduce, or at least control, this variability. It is unlikely, however, that it would be possible, or even desirable, to eradicate the behaviour entirely, since 'the examiner factor' is the most important characteristic that distinguishes face-to-face speaking tests from their tape-mediated counterparts. Yet, we should be concerned if that factor decreases test reliability, even if it appears to increase the face validity of the assessment procedure.

The interactional nature of the speaking assessment has led testing professionals to speculate on the effect of differing interlocutor behaviours on test-takers' performance (Brown, 2003; Brown & Lumley, 1997; Reed & Halleck, 1997). In this paper I regard differing interlocutor behaviours, particularly when they are manifested as deviation from the interlocutor's script, as instances suggesting that the interaction may be proceeding according to a knowledge schema which does not conform to the expectations of a speaking test. This matter is explored further in Section 6 below.

5. Participants

More than 400 learners were assessed for the ESOL Effective Practice Project. From these, a sample was chosen of learners with low levels of education, and by implication little experience of formal assessment, who nonetheless underwent the higher-level assessment. To do this, I first isolated the recordings of those learners who reported little or no schooling in their home countries but who did the higher-level test; there were only 23 of these, as most learners on the project who reported little or no schooling were in low-level ESOL classes and underwent the lower-level assessment, where they spoke only to the interlocutor in basic information and interaction routines. The purpose of this sampling reflects an interest outlined in Section 2 above concerning schooled and non-schooled responses to pedagogical events.

Table 1 shows the learners whose assessments are discussed in this paper and whose interactions I examined in depth. The learners' details are presented in the order in which their contributions appear in the extracts.

Within the group there is a range of schooling (0–4 years); moreover it is not clear what correspondence there might be between schooling in the learners' home countries. That is to say, is one year's schooling in Somalia equivalent to one year's schooling in Vietnam? In addition, although the learners may have had little or no schooling, and hence possess an underdeveloped knowledge schema for the testing event, they may be acculturated to the norms of educational systems generally to the extent that they possess a notion of correctness.

In the extracts which follow, the learners spoke to interlocutors who were members of the project team.

Table 1
Learners in the study: A purposive sample*

Learner     F/M   Age     First (expert) language   Years in UK   Years schooling
Hanan       F     40–49   Somali                    2.5           2
Tam         F     40–49   Vietnamese                5             1
Shafiqa     F     20–29   Punjabi                   2             4
Rosa        F     16–19   Tamil                     <1            0
Mohammed    M     16–19   Somali                    1             3

* All names of learners in this study are pseudonyms.


In the case of Hanan (Extract 1), Tam (Extracts 2 and 3) and Shafiqa (Extract 4), the interlocutor was male; in the case of Rosa and Mohammed (Extract 5), the interlocutor was female.

This is only a small purposive sample, yet through investigation of these learners' interactions with their partners and the interlocutors in the test event, some interesting patterns emerge which are not atypical of the testing experience of ESOL learners in the Effective Practice Project.

6. Analysis

The first step in the analysis was to bring to the data a very broad question: What happens when learners with little or no experience of formal schooling carry out the speaking assessment? The study thus takes a modified conversation analysis approach (Lazaraton, 2002:76–77) in that rather than taking any stretch of language test data and seeing what emerged (unmotivated looking), I acted on the hunch of that preliminary guiding question. On the basis of this 'motivated looking' (contra Sacks, 1984, cited in Lazaraton, 2002:76), I isolated the test transcripts for analysis. In this section, four particular issues are discussed with reference to the analysis of illustrative examples. These issues are: whether the test event is viewed by the learners involved as a test or as a different type of speech event; the case of under-elaboration within the test event; the employment by learners of cooperative strategies to render the event more successful in terms of language produced; and the divergence of the interlocutor from the test script, understood as indicating a stance towards the testing frame whereby the interlocutor is not entirely committed to treating the event as a test.

6.1. A test or not a test?

The first extract of data exemplifies a common pattern in the EEPP tests. The interlocutor behaves according to the requirements of the speaking assessment and the training procedure by deviating only minimally from the interlocutor script, while the test-taker, with limited experience of this kind of assessment, addresses the questions she is being asked as if they are part of a conversational exchange. We see an instance here where the interlocutor stays with the script, generating turns which would be very out of place in a conversation.

(The transcription conventions for this paper, and others, are listed later in this special issue).

Extract 1. Hanan (H), talking to the interlocutor (I) in Part 1 of the assessment.


In turn 9 the interlocutor does not ask 'what sort of work?' but instead cuts the topic short with 'OK'; in turn 11 the pause after 'thank you' signals a very abrupt topic boundary. The repetition of the learner's name in turn 11 also indicates adherence to a script, and hence to a testing frame, rather than engagement with conversational discourse. Hanan, on the other hand, responds to the interlocutor's elicitation (turns 5 and 6) with an elaboration which might be seen in either a testing event, a conversation, or another type of interview; she could be in any frame which in her view would require a full response to a question addressed to her. Constraining the interlocutor by the use of a script has the result of not allowing Hanan to expand her responses to the extent that she is potentially able.

6.2. Under-elaboration

In other cases learners say very little in the test itself, but when it is over they become voluble and expansive, presenting the assessors with a long turn. This is the case with Extracts 2 and 3 (below). It is possible that the learner, Tam, brings to the test a knowledge schema of a formal speaking experience where she is not expected to respond at length. Thus rather than producing a substantial sample of language which can form the basis of assessment, it is only when the test is over that a conversational frame can be brought about.

Extract 2. Tam (T), talking to the interlocutor (I) in Part 1.

Note the instances where the interlocutor has to use back-up cues (part of the interlocutor script) to elicit a response from Tam. This reticence on the part of the learner is in sharp contrast to the episode immediately following the test (Extract 3).


Extract 3. Tam (T), talking to the interlocutor (I) and the assessor (A) after the test.

It is clear that she is quite able to embark on a long turn, trying to get across a complex message, but did not do so in the test itself. The knowledge schema which she brings to the interaction requires her to produce minimal responses until the formal aspect of the test is complete (signalled in Extract 3, turns 1 and 2). At that point, outside the bounds of the test, and when all participants are engaged in an informal 'chatting' frame, she is able to produce the long turns (Extract 3, turns 5 and 9).

What possible factors account for this contrast? Ross (1998) calls the phenomenon of saying very little in the test itself under-elaboration. There are a number of reasons why students such as Tam under-elaborate, some of which I note here. The first two are suggested by Ross (1998:345). (1) Learners might not possess the pragmatic competence (Bachman, 1990) to tackle or answer the question. (2) Under-elaborate answers might 'mark the boundaries of what are considered by the candidate as private matters'. (3) Though her schema for a speaking test may not match that of the test designers' expectation, like most learners Tam possesses a notion of correctness. Despite her lack of schooling she comes to the test with a knowledge schema, possibly based on some knowledge of the overall dominant educational culture in Vietnam or of her prior learning experience in the UK, that encourages her to focus on getting the answer right rather than demonstrating her range of ability at the risk of making mistakes. This, coupled with her lack of experience of the formal testing situation, may have prompted her to feel that it is better to say little in the test itself than to produce incorrect utterances.


(4) Speaking test candidates in general are undoubtedly under an amount of what Brown and Yule (1983:34) term communicative stress: they are in the presence of unfamiliar listeners, and it is not entirely clear what they are expected to produce in terms of length and complexity of utterance. (5) With the shift from 'test/interview' to 'conversation' comes a corresponding shift in the power relations between Tam and the testers. When Tam no longer feels that she is the subordinate partner in an unequal interaction, she is able to expand her responses. (6) Another source of perceived inequality could be gender. Both the interlocutor and the assessor, as well as the second student in the test, were male. (7) Tam's test experience may exemplify the different cultural expectations of conversational style. The insight from the ethnography of speaking is that descriptors in a speaking test may well describe aspects of conversational style valued by a particular speech community and not by others (Hymes, 1974). As Young (1995:6) summarises: '. . . viewed from the perspective of the ethnography of speaking, it is clear that the descriptions of speaking in LPI [Language Proficiency Interview] rating scales are, in effect, summaries of features of conversational style that are considered desirable by native speakers of English'. There remains therefore a norm in such tests based on a model of communication in English-dominant countries, to which all learners are obliged to aspire.

What emerges from this complex range of possibilities is that conclusions drawn from any hesitations and minimal responses on the part of the learners in the test risk conflating lack of ability with a number of other potential accounts.

6.3. Shared frame interpretations and cooperation

Shared frame interpretations, where the test-takers and the interlocutor all understand the purpose of the test, including the need to produce an assessable quantity of language, can create a generally beneficial effect on the test outcome, where co-construction of turns by the other participants in the testing event gives one the sense that 'we are all in this together: you are having trouble in your test and we're here to help'. In this next extract the turns are co-constructed by Shafiqa, who is supposed to be asking the question but has run into difficulties, and Lubna, the addressee.

Extract 4. Shafiqa (S), talking to Lubna (L), in Part 2.

Lubna, a more educationally experienced student, realises Shafiqa is experiencing problems and gives her quiet hints at the first two arrowed turns (6 and 10). At turn 12 she overlaps with Shafiqa at the beginning of her answer, almost as if she wants to help her classmate to complete her turn as quickly as possible.

Such cooperative and supportive behaviour suggests solidarity between the students in the face of a difficult experience. This would certainly accord with the learners' behaviour in this particular class, where they are a very tight-knit group, sharing a similar cultural background and with social networks in common beyond the classroom.


It also demonstrates the advantage of paired format testing: although the task requires Shafiqa to ask Lubna questions, neither she nor Lubna is constrained by an interlocutor frame, and they are able to embark on the co-construction seen in Extract 4. More broadly, some research into pairings in paired format tests suggests that candidates in such tests should be allowed to select their own partner to reduce anxiety levels (Norton, 2005:292).

6.4. Interlocutor frames

Finally, we turn to the different frames which are brought about when interlocutors bring knowledge schemas different from testing schemas to the speaking test event. Many instances when interlocutors diverge from the script involve accommodation to the learners' levels or the use of 'learnerese' (Ross & Berwick, 1992) and could be interpreted as a teaching frame. Extract 5 shows an example where the interlocutor's expectation of the speaking test event does not conform to the speaking test schema; hence the frames which she brings about through the interaction shift from 'testing' to 'teaching' to 'chatting', as we see below.

Extract 5. Rosa (R), talking to Mohammed (M) in Part 2; interlocutor (I).


In turn 1 the interlocutor is within a 'testing' frame, and introduces the task to be performed with words from the interlocutor's script. In turns 2 to 8 she is in a 'teaching' frame: she sees the learner Rosa having difficulties forming the questions required, but instead of allowing silence to reign until an utterance emerges, or relying on the second learner's support, she helps the learner herself. A strong interpretation of such assistance would suggest that the interlocutor has not accepted the nature of the communicative event as a language test, and is assisting the learner as one might expect a teacher to do. This impression is strengthened by an examination of turns 15, 18, and 20, where the nature of the support given by the interlocutor again diverges from the testing frame. In turn 26 the interlocutor draws the test to a close, using the phrase 'that's the end of the conversation'. Thus she is again in the test frame, but her uncertainty about the event is given away by her use of the word 'conversation'. The interlocutor in this extract noted afterwards that this may be seen as an example of test-giver stress, whereby the apparently powerful interviewer flounders if the test-taker is unsuccessful. One could assume, judging from her responses, that the learner Rosa finds the experience difficult: she herself does not possess the speaking test knowledge schema and can only respond to the frames which are shifting around her. It is clear that the interlocutor's deviation from the script does not help Rosa particularly; it seems that excessive and impromptu scaffolding can be problematic rather than supportive for test-takers.

7. Conclusion and implications for testing the speaking skills of adult ESOL learners

The points raised in the discussion in this paper have implications for testing the speaking skills of adult ESOL learners, particularly low-level learners. Such implications are pertinent at a time of large-scale and high-stakes testing of English for citizenship in English-dominant countries.

7.1. Implications of divergent frame interpretations

The obvious implication is that test-takers should be apprised of the test format and trained thoroughly before embarking on the test. Given the high-stakes nature of tests which are designed to satisfy a language requirement for naturalisation or citizenship, as well as the widespread introduction of national tests for all ESOL learners, test-taking training is likely to become an integral part of ESOL lessons even at the very lowest levels. And if all candidates for the tests are working within a common frame, validity is strengthened. If everyone—testers and test-takers alike—assigns the same pragmatic significance to the test event and the associated materials, the result will be heightened validity. Or put plainly, if everyone knows it is a test and behaves accordingly, then the test has both increased reliability and validity as a test (not a conversation, a learning episode or another type of interview). Yet many learners in low-level ESOL classes have not had access to basic schooling, which means they lack experience of what is expected in formal teaching and learning situations, rendering the teaching of test-taking techniques difficult. Ultimately, we may question whether it is fair to expect migrant learners with little or no previous educational experience to possess appropriate and adequate frame interpretations for a speaking test. If not, alternative assessment approaches may have to be explored.

7.2. Implications of interlocutor variation

Standardisation for reliability will perhaps be at the expense of validity, a contention which brings to the fore the question of what the test is a test of (i.e. what is the construct).


If all parties—interlocutors, assessors, test-takers—agree that the communicative event is a test (i.e. more interview-like than conversation-like) then its validity as a test (rather than as something else) is not so much at risk. In any case, all tests should have specifications that state clearly what they claim to test, and what they claim to say about those who pass the test. Regarding the process of standardisation and training, testers need to be aware of the problem of interlocutor variation over time. Brown (2003:19–20) makes the point that '. . . interviewer behaviour [is] rarely scrutinised once initial training is completed. With the emphasis in tests of second language proficiency being increasingly on relatively unstructured naturalistic interactions, however thorough the initial training, it is incumbent on test administrators to ensure that interviewers' styles do not over time become so diverse that they present different levels of challenge'. The current rise in the number and scope of speaking assessments administered to ESOL learners, even at the very earliest stages of their English language learning careers, presents a challenge to testing organisations to maintain the rigour with which such variation in interlocutor behaviour is kept in check.

Acknowledgements

I would like to thank John Callaghan, Lynda Taylor and Martin Wedell, as well as the reviewers and the co-authors of this volume, for their comments on earlier drafts of this paper.

References

Bachman, L. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bartlett, F. C. (1932). Remembering. Cambridge: Cambridge University Press.
Brown, A. (2003). Interviewer variation and the co-construction of speaking proficiency. Language Testing, 20/1, 1–25.
Brown, A., & Lumley, T. (1997). Interviewer variability in specific-purpose language performance tests. In A. Huhta, V. Kohonen, L. Kurki-Suonio, & S. Luoma (Eds.), Current developments and alternatives in language assessment: Proceedings of LTRC 96 (pp. 137–150). Jyvaskyla: University of Jyvaskyla.
Brown, G., & Yule, G. (1983). Teaching the spoken language. Cambridge: Cambridge University Press.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1/1, 1–47.
Cook, G. (1994). Discourse and literature: The interplay of form and mind. Oxford: Oxford University Press.
Cook, G. (2000). Language play, language learning. Oxford: Oxford University Press.
Eggins, S., & Slade, D. (1997). Analysing casual conversation. London: Continuum.
Galaczi, E. D. (2003). Interaction in a paired speaking test: The case of First Certificate in English. University of Cambridge ESOL Examinations Research Notes 14, November 2003.
Goffman, E. (1974). Frame analysis: An essay on the organization of experience. New York, NY: Harper and Row.
Goffman, E. (1981). Forms of talk. Philadelphia: University of Pennsylvania Press.
Gumperz, J. J. (1977). Sociocultural knowledge in conversational inference. In M. Saville-Troike (Ed.), Linguistics and anthropology: Georgetown University Round Table on languages and linguistics 1977. Washington, DC: Georgetown University Press.
Hymes, D. (1962). The ethnography of speaking. In T. Gladwin, & W. Sturtevant (Eds.), Anthropology and human behaviour. Washington, DC: Anthropological Society of Washington.
Hymes, D. (1972). On communicative competence. In J. B. Pride, & J. Holmes (Eds.), Sociolinguistics. Harmondsworth: Penguin.
Hymes, D. (1974). Foundations in sociolinguistics: An ethnographic approach. Philadelphia, PA: University of Pennsylvania Press.
Johnson, M., & Tyler, A. (1998). Re-analysing the context of the OPI: How much does it look like natural conversation? In R. Young, & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency. Studies in bilingualism 14 (pp. 27–51). Amsterdam and Philadelphia: John Benjamins Publishing Company.
Lazaraton, A. (1992). The structural organisation of a language interview: A conversation analytic perspective. System, 20/3, 373–386.


Lazaraton, A. (2002). A qualitative approach to the validation of oral language tests. Studies in language testing 14. Cambridge: University of Cambridge Local Examinations Syndicate/Cambridge University Press.
Lumley, T., & Brown, A. (2005). Research methods in language testing. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning. Mahwah, NJ and London: Lawrence Erlbaum Associates.
Moder, C. L., & Halleck, G. B. (1998). Framing the language proficiency interview as a speech event: Native and non-native speakers' questions. In R. Young, & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency. Studies in bilingualism 14 (pp. 117–146). Amsterdam and Philadelphia: John Benjamins Publishing Company.
Norton, J. (2005). The paired format in the Cambridge speaking tests. ELT Journal, 59/4, 287–297.
Reed, D. J., & Halleck, G. B. (1997). Probing above the ceiling in oral interviews: What's up there? In A. Huhta, V. Kohonen, L. Kurki-Suonio, & S. Luoma (Eds.), Current developments and alternatives in language assessment: Proceedings of LTRC 96 (pp. 225–238). Jyvaskyla: University of Jyvaskyla.
Riggenbach, H. (1991). Toward an understanding of fluency: A microanalysis of nonnative speaker conversations. Discourse Processes, 14, 423–441.
Ross, S. (1998). Divergent frame interpretations in language proficiency interview interaction. In R. Young, & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency. Studies in bilingualism 14 (pp. 333–353). Amsterdam and Philadelphia: John Benjamins Publishing Company.
Ross, S., & Berwick, R. (1992). The discourse of accommodation in oral proficiency interviews. Studies in Second Language Acquisition, 14, 159–176.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organisation of turn-taking for conversation. Language, 50/4, 696–735.
Simpson, J. (2005). Cambridge ESOL and the NRDC ESOL Effective Practice Project. University of Cambridge ESOL Examinations Research Notes 19, February 2005.
Tannen, D. (1993). What's in a frame?: Surface evidence for underlying expectations. In D. Tannen (Ed.), Framing in discourse. Oxford: Oxford University Press.
Tannen, D., & Wallat, C. (1993). Interactive frames and knowledge schemas in interaction: Examples from a medical examination/interview. In D. Tannen (Ed.), Framing in discourse. Oxford: Oxford University Press.
Taylor, L. (2003). The Cambridge approach to speaking assessment. University of Cambridge ESOL Examinations Research Notes 13, August 2003.
University of Cambridge ESOL Examinations KET Handbook. (online) http://www.cambridgeesol.org/support/handbooks.htm.
van Eck, J. A., & Alexander, L. G. (1977). Waystage: An intermediary objective below threshold-level in a European unit/credit system for modern language learning by adults. Strasbourg: Council for Cultural Co-Operation of the Council of Europe.
van Lier, L. (1989). Reeling, writhing, drawling, stretching, and fainting in coils: Oral proficiency interviews as conversation. TESOL Quarterly, 23/3, 489–508.
van Lier, L. (1996). Interaction in the language curriculum: Awareness, autonomy and authenticity. London and New York: Longman.
Vocate, D. (1987). The theory of A. R. Luria: Functions of spoken language in the development of higher mental processes. Hillsdale, NJ: Lawrence Erlbaum.
Widdowson, H. G. (1978). Teaching language as communication. Oxford: Oxford University Press.
Widdowson, H. G. (1990). Aspects of language teaching. Oxford: Oxford University Press.
Widdowson, H. G. (2003). Defining issues in English language teaching. Oxford: Oxford University Press.
Young, R. (1995). Conversational styles in language proficiency interviews. Language Learning, 45, 3–42.