A corpus-based study of the frequency and meaning distribution of modals in the EFL coursebooks of...


A corpus-based study of the frequency and meaning distribution

of modals

in the English language textbooks of Greek state primary and

secondary schools

Niovi Hatzinikolaou

Aristotle University of Thessaloniki

Department of English Language and Literature

A corpus-based study of the frequency and meaning distribution of five modals

Thessaloniki, 2012

ABSTRACT

The present study reflects the renewed interest of language

pedagogy in the issue of authenticity in the EFL classroom

context under the new perspective of pedagogic corpora. The

study involves a comparative analysis of the modal auxiliaries

in terms of frequency and distribution of meaning in the

spoken part of the BNC corpus and in a series of EFL

textbooks used in Greek State primary and secondary schools

(5th and 6th grade of primary school and 1st grade of junior high school). The

primary aim is to examine the use of modals in the EFL

textbooks as opposed to their real world use represented in

the BNC corpus. The findings reveal a rather limited

correspondence and signify the need for incorporating in the


textbooks the most typical or frequent occurrences of modals

in natural communicative situations if we are to achieve a

more comprehensive representation of the native speaker’s use

of language. The results of the study will prove useful for

teachers and textbook designers for both research and

pedagogical purposes. Recommendations and suggestions for

further improvements in EFL textbooks associated with a

broader shift towards corpus-based materials are also

identified.

CHAPTER ONE: Introduction

Authenticity in the EFL classroom has recently been the

concern of many practitioners of corpus linguistics in


language pedagogy. The renewed interest in the content of

the English language curriculum is evident in the increasing

number of corpus-based studies that address a wide range of

topics in the area of grammar and vocabulary instruction. The

principal aim of those studies is to examine the relationship

between school language and real-life language use.

Examples include Ute Römer’s (2004a, 2004b) studies on the treatment of modal

verbs and if-clauses in EFL textbooks, Cullen & Kuo’s (2007)

research on the coverage of conversational grammar and

Biber et al.’s (2004) study on academic discourse.

The reason that warrants the current study is a similar

need for Greek students to learn a kind of English that is

used and understood by native speakers. However, we have to

take into account that Greek state school textbooks are mainly

directed at written discourse, as in most cases they are

exam-oriented. This means that they do not emphasize the building

of communicative skills that employ a functional kind of

language. To argue for a move towards communicative textbooks

we will first have to examine the language used in the current

ones. The leading question of our research is: how and to

what extent does the English language taught in Greek state


schools differ from authentic spoken English? In

particular, the study concentrates on one of the most

problematic areas of English grammar instruction for students

and teachers, the modal verbs. The choice of this particular

grammatical area has been made on the basis of the

significance of modals in functional language and upon

consideration of the early introduction of modal verbs in the

EFL textbooks used in the Greek state primary school. The

study addresses only five basic modals (can, should, may, might,

must).

The research is conducted in two primary school and two

junior high school EFL textbooks and involves the analysis of

the frequency of occurrence of modals and the distribution of

their different meanings. Since we are interested in the

accurate use of modals in communicative contexts, the results

of the analysis are compared with frequency and meaning

distribution data from the spoken subcorpus of the BNC, which

is highly representative of native speech.

The findings derived from our research are expected to be

of immense value to teachers and textbook designers in that


they may provide the incentive for the reassessment of the

Greek State school textbooks from a corpus-based perspective.

CHAPTER TWO: Literature review

Corpora can be described as “large collections of databases of

language that include different kinds of discourse” (Schmitt


2000, p.68). More accurately, a corpus is “a principled

collection of texts that represents something and can be used

for qualitative and quantitative analysis” (O’Keeffe, McCarthy

and Carter, 2007; Biber, Conrad and Reppen, 1998).

Some of the earliest corpora were developed at the

beginning of the 1900s. They were constructed manually following a

lengthy procedure of collection and typing of written excerpts

from specific genres of discourse. Considering the effort for

their production at that time “a one million word corpora was

on the large side” (Schmitt, 2000). The real revolution in

corpus linguistics came with the arrival of computers. The

great advances of technology that allowed the scanning and

digital manipulation of text opened the way for the

compilation of large corpora comprising millions of words.

Today “multimillion-word corpora such as the British National

Corpus (BNC, 100 million), the Bank of English (450 million)

or the International Corpus of English (ICE, 20 million) are

widely available and can be run on an ordinary PC or accessed

via the web” (Braun, 2005, p.48).

A corpus offers a new perspective in the description of

language and, as Hunston (2002) points out, it serves as “a


more reliable guide to language use than native speaker

intuition” (p.20) regarding judgments about collocation,

frequency, phraseology and pragmatic meaning. At several

instances where native speaker intuition may fail to justify a

particular linguistic choice, corpora can provide the right

answer or solution.

However valuable they might be, corpora are still subject to

limitations as in numerous cases they cannot function

independently of native speaker intuition. Corpora don’t give

information about whether a particular language feature is

possible or acceptable in that they only provide us with

evidence (Hunston, 2002, Kaltenböck and Mehlmauer-Larcher,

2005). Essentially, they “present language out of its context”

(Hunston, 2002, p.23) and this way many important features

specific to each kind of discourse are lost or

underrepresented.

As mentioned before, corpus research has shed light on

two important aspects of vocabulary, frequency and

collocation. According to Schmitt (2000), knowledge of the

most frequent words in discourse has immediate pedagogical

implications as it facilitates decisions about what vocabulary


is vital to teach in an EFL environment. What is more,

whatever knowledge we have about the collocational behavior of

words comes from corpus observations usually with the help of

specialized programs called concordancers (Schmitt, 2000;

O’Keeffe, McCarthy and Carter, 2007). Apart from language

description, applications derived from corpus investigations

are found in a number of different areas.

Corpora have definitely revolutionized the area of

lexicography. Corpus data have been incorporated into

dictionaries and grammars to establish authenticity, offer

insights into frequency, collocation and phraseology and

provide information about variation in different registers

(Hunston, 2002). Today all modern widely used dictionaries,

such as the Longman Dictionary of Contemporary English and

the Oxford Advanced Learner’s Dictionary, are corpus-based

(Hunston, 2002).

Hunston (2002) suggests that “since corpora are used to

raise awareness about language in general, they are extremely

useful in training translators and in pointing up potential

problems for translation” (p.123).


Another notable application is found in stylistics,

where O’Keeffe, McCarthy and Carter (2007) note that “corpora

provide a novel methodology for analyzing literary texts like

books, poems etc through the study of collocations” (p.18).

Corpora have also been widely used in forensic

linguistics for comparisons between documents (confessions,

letters) to ascertain their genuineness or to identify

authorship in academic settings addressing issues of

plagiarism (O’Keeffe, McCarthy and Carter, 2007; McEnery,

Xiao & Tono, 2006).

Finally, corpora have partially contributed to the study

of pragmatics, as the ease and speed of processing

enable researchers to examine certain pragmatic features

like hedging, politeness devices, discourse markers, irony etc.

(McCarthy and O’Keeffe, 2007).

Notable in the area of language learning is the emergence

of data-driven learning (Johns, 1991), an inductive learning

approach that makes use of the tools of corpus linguistics to

establish authenticity in the classroom (Gilquin & Granger,

2010). Based on the assumption that “effective learning is

itself a form of linguistic research” (Johns, 1991, p. 297),


data driven learning challenges the conventional roles of

teachers and learners. The teacher has to abandon the role of

the expert, taking on that of “research organizer” (Johns,

1991, p. 297, Bernardini, 2004). The learner is encouraged to

discover language for himself through a process of observation

and interpretation of patterns of use found in corpus data,

which is known as discovery learning (Bernardini, 2004; Aston,

2009). This direct learning process cuts down the mediatory

role of the reference materials and gives learners the

autonomy to consult corpora and use concordancing software to

find an answer to a problem or draw their own conclusions

about an issue (Chambers, 2010).

“Corpus based descriptions of language have also had an

immediate effect on language teaching methodology and the

content of instruction” (Kennedy, 1998, p.281). Findings about

the frequency of words, their most central uses and their

collocational contexts affect the selection, the sequencing

and the emphasis placed upon the teaching items and help

teachers to set priorities (Kennedy, 1998). Additionally, we

see that the “recurrent fixed sequences identified by corpus

analysis in the English language are now taken seriously in


language pedagogy” (p.289) and their role has become central

for successful learning (Kennedy, 1998).

Since corpora are designed for a specific purpose, we

find a variety of different types of corpora. The first major

distinction to be made is between specialized and general

corpora.

A general corpus includes a wide variety of texts of many

different genres. It is typically used as a reference tool

with applications across different disciplines. The

best-known general corpora, comprising millions of words, are the

British National Corpus (BNC), also used in this research, and

the Bank of English (Hunston, 2002).

A specialized corpus on the other hand is representative

of a particular genre, such as newspaper editorials, academic

articles, learner-produced materials, etc.; specialized corpora

allow the investigation of “a particular type of language”

(Hunston, 2002, p. 14).

Another distinction would be between comparable and

parallel corpora. The term comparable refers to two or more

corpora in different languages or language varieties that are

used to trace any common or uncommon aspects between them


(Bernardini, 2004). The ICE corpora (International Corpus of

English) are widely used comparable corpora, each one

representing a different variety.

Parallel corpora are limited to texts that have been

translated from one language to another or produced in a

variety of languages. Usually these comprise European Union

documents and are indispensable tools in the work of

translators (Hunston, 2002).

A monitor corpus is a large representative corpus that is

increasing in size, as it is constantly updated with new

information. Monitor corpora like the COBUILD corpus are typically

used to “track the current changes in a language” (Hunston,

2002, p. 16).

McEnery, Xiao & Tono (2006) also use the terms

‘historical’ or ‘diachronic’ to describe a corpus of texts

taken from different time periods that helps researchers keep

track of linguistic development. The ARCHER corpus is a

notable example of a diachronic corpus.

The last important distinction is between two types of

corpora constructed for immediate pedagogic applications.


Granger (2004) defines learner corpora as “electronic

collections of spoken or written texts produced by second or

foreign language learners” (p. 124). Learner corpora provide

researchers with information on learner “interlanguage

development” (p. 11) which can be used to classify recurrent

errors or misuses and determine the difficulty of teaching

items (Aston, 2000). Osborne (2002) notes that learner

corpora have other similar applications in language teaching

in that teachers may use them as a basis for language

awareness exercises that would be directed towards typical

problems students face (lexical overuse, grammatical errors,

L1 transfer). Moreover, the comparison of the learner output

with data from a native speaker corpus makes learners aware of

their deviant uses and helps them correct them (Osborne, 2002,

Hunston, 2002).

A pedagogic corpus contains an amount of language that

is representative of what students have been exposed to

throughout a language course, including textbook contents and

transcripts of the taped material used (Hunston, 2002). This

language is usually compared with the language in a native

speaker corpus to determine its degree of authenticity or


usefulness. Teachers may also exploit such corpora to present

their students with all the different occurrences of a

language feature in the textbook and draw their attention to

its use (Hunston, 2002).

This potential of corpus linguistics has been exploited

by a number of researchers in the field of applied linguistics

who have conducted comparative studies to examine the

differences between native language and the language students

are exposed to through EFL textbooks. In most cases, studies

of this kind entail the compilation and use of pedagogical

corpora.

Ute Römer’s study involves the observation of EFL

textbook language in relation to the kind of language likely

to occur in “natural communicative situations” (Römer, 2004b,

p. 152).

Based on her analysis of EFL textbooks in German schools,

Römer (2004b) suggests that the language found in books,

especially in dialogues, tends to be simplified and unnatural

compared to the real everyday language. She argues that “it’s

rather doubtful whether texts like this can better serve the


purpose of preparing learners for the English that they are

likely to encounter in real life” (Römer, 2004b, p. 156).

The GEFL corpus and if-clauses

Römer’s search for authenticity in ELT goes further with the

compilation of the German English as a Foreign Language

Textbook Corpus (GEFL), a computerized version of a collection

of German EFL textbook texts that enables her to examine

rapidly and efficiently the occurrences of particular items in

context (Römer, 2004b). The GEFL corpus consists of two

subcorpora of two major EFL coursebooks, widely used in

German secondary schools, Green Line New and English 2000. As

“spoken-type texts like dialogues, interviews better serve the

purposes of communicative competence and prepare students for

any prospective discourse with native speakers of English ”

(Römer, 2004b, p. 156), Römer excludes literary written

samples from the collection and limits the comparison between

GEFL corpus and BNC data to spoken language. She focuses on a

problematic area of English language for teachers and

learners, the if-clauses, with special attention to their use in

conditional constructions and their collocational behavior

(Römer, 2004b).


According to the analysis, EFL textbooks tend to

overemphasize certain tense combinations in if-clauses, normally

referred to as Type 0, Type 1, Type 2 and Type 3 Conditionals

in regular grammars, whereas the most frequent uses in BNC are

“underrepresented” (Römer, 2004b, p. 160) or even left out.

Römer (2004b) underlines that to achieve authenticity

there is a need for accurate contextualization of lexical

items, with “the use of contexts in which they typically

appear in actual language use” (p.161). More importantly, it

is necessary to help students become more confident in

extending the use of structures beyond what they have been

taught in the textbook examples (Römer, 2004b).

Modal Verbs

In addition to her research on if-clauses, Römer uses the GEFL

corpus to address another problematic grammatical phenomenon,

the modal verbs. The issues under investigation include the

coverage of the native speaker English in the textbook

materials and the crucial differences between “school” English

and “real” English at the level of grammar (Römer, 2004a).


Römer (2004a) carried out a BNC corpus-based analysis of

ten modal verbs (can, could, may, might, will, would, shall, should, must

and ought to) that appear in the particular EFL textbook series

to find their frequency, the contexts in which they typically

occur and their different meaning uses in spoken discourse.

She then carried out a co-occurrence analysis to find “the

occurrence of different modals in questions, set phrases,

if-clauses, and passive constructions” (p. 289). The most

frequent modals were will/’ll, would/’d and can. As the data

showed, can is used to express three different meanings:

ability (36%), possibility (31.5%) and permission (23.5%)

(Römer, 2004a).

In the second part of the survey, Römer (2004a) went

through the same process with the German EFL textbook series, Learning

English Green Line (Vols 1-6) and the grammar book, Learning English

Grundgrammatik.

The last part of the survey involved the comparison of

the findings in the textbook and corpus analysis in terms of

frequency, distribution of meanings and co-occurrence. Römer

(2004a) concludes that “this comparison makes it clear that

there are huge discrepancies between the use of modal


auxiliaries in authentic English and in the English taught in

German schools” (p. 193). She observes, for instance, that

specific modals like will/’ll, can and must tend to be overused relative to

the rest (Römer, 2004a). She also notices that textbooks

give unreasonable precedence to lower-frequency meanings of

modals (e.g. the permission meaning of may), whereas meanings

commonly used in speech are underrepresented or completely

left out (Römer, 2004a). In addition, the negative form of

some modals is used in higher percentages in textbooks and

there is an absence of important fixed phrases of everyday speech

like “if I may” (Römer, 2004a).

All those mismatches of corpus and textbook data signify

a gap between the spoken language and the “school” language.

As a result, Römer (2004a) insists on the introduction of a

more native-like kind of English in the EFL textbooks and

proposes that the modals be presented as a group rather than

individually, so as to make students aware of their

distinctive properties compared to other verbs (Römer, 2004a).

They should also be taught in the order of frequency in which they are

found in the corpus. Similarly, “if we are to enable pupils to

communicate successfully” priority should be given to the


meanings of modals that are most preferred in speech (Römer,

2004a, p. 196). In this respect, Römer emphasizes the

development of new pedagogical material that will introduce

corpus-generated examples instead of ready-made ones. Corpus

data will thus help students gain insight into how language is

used by native speakers and view the different contexts to

which the use of a grammatical feature may be extended.

Similar studies that involve the compilation of textbook

corpora have been conducted by Biber et al. (2004) and Anping

(2005), as cited in Meunier & Gouverneur (2009). Anping

constructed a corpus out of texts taken from EFL textbooks of

Chinese learners and from international corpora. The aim of

his research was to find out if the EFL textbooks are

consistent with the current approaches in teaching and

learning (Meunier & Gouverneur, 2009). Biber, on the other

hand, focused on academic discourse using a corpus of written

material from academic textbooks. The corpus investigation

revealed the use of particular linguistic features in EAP

textbooks and classrooms in American universities (Meunier &

Gouverneur, 2009).


Richard Cullen and I-Chun Kuo (2007) are among other

researchers in the field who have argued that the findings of

corpus research should be incorporated into pedagogical

materials. They turn their attention to the treatment of

spoken grammar in the EFL textbooks published in the United

Kingdom. They point out that this kind of “conversational”

(p. 362) grammar used in informal, everyday speech events

needs to be included and accurately represented in the EFL

textbooks (Cullen & Kuo, 2007). As a result, they focus on

particular grammatical features taken “from corpus

descriptions of standard, non dialectal conversational

English” (p. 362) and selected in terms of frequency. As

Cullen & Kuo (2007) claim, apart from frequency of occurrence,

knowledge of such features facilitates the communicative

interaction of students with native speakers.

To examine the current coverage of the conversational

features in the EFL textbooks, the researchers categorize them

into three different categories. Category A refers to the

situational ellipsis, “noun phrase prefaces known as heads”

(Carter & McCarthy, 1997) and noun phrase tags (tails) (Cullen &

Kuo, 2007, pp. 366-367). In Category B, they include fixed


words or phrases like actually, really, kind of/sort of, you know etc.

that are frequently used in everyday speech for a variety of

functions (hedging, imprecision, vagueness) (Cullen & Kuo,

2007). Category C consists of lexical units that have

alternative “ungrammatical” uses in informal conversational

contexts (less instead of fewer with countable plural nouns or

was rather than were in second conditional structure) (Cullen &

Kuo, 2007).

They survey 24 mainstream EFL textbooks that cover five

different proficiency levels to find out “whether learners are

made aware of such phenomena or whether they are only

presented with the forms traditionally felt to be correct”

(Cullen & Kuo, 2007, p. 371).

As summarized in the findings of their research, Cullen &

Kuo (2007) observe that Category A tail and head structures

occur only in some of the advanced-level textbooks while

ellipsis occurs only in two of the 24 books. Special attention

is given to the Category B features especially in the upper

intermediate and intermediate levels, as expressions like kind

of/sort of, a bit, I mean and you know are found to be common in the

textbooks. As for Category C features, they note that only


four of the textbooks surveyed provide some feedback (Cullen &

Kuo, 2007).

Cullen & Kuo (2007) conclude that even though EFL

textbooks of the British market have attempted to incorporate

some aspects of spoken grammar they still come up with a

rather limited representation of the variety of features it

consists of. Especially where the educational goal is

communicative competency and engagement with native speakers,

the development of EFL textbooks needs to be targeted to a

more adequate representation and practice of conversational

grammar even in the case of pre intermediate and intermediate

learners (Cullen & Kuo, 2007).

Meunier & Gouverneur’s studies (2009) concentrate on the

use of phraseology in the context of EGP textbooks. Their

research is based on a huge corpus of the most widely used EGP

textbook series for both intermediate and advanced levels

(the TeMa corpus). Apart from texts, the TeMa corpus includes

transcripts of all taped material and vocabulary exercises

along with the guidelines, providing thus “a richness of

pedagogic input” (Meunier & Gouverneur, 2009, p. 187). Most

importantly, the TeMa corpus is “pedagogically annotated”


(Meunier & Gouverneur, 2009, p. 186). Specific tags were

applied to each one of the vocabulary exercises and tasks

according to its purpose, e.g. matching, and to whether the

vocabulary used in the activity is given in advance. According

to Meunier & Gouverneur (2009), this kind of coding of the

vocabulary part of the corpus allows for exploitation of the

data from multiple perspectives. Researchers are able to

compare textbooks representing different levels in terms of

selection of vocabulary and investigate their input

(expressions, words) and the way it is practiced. The findings

serve as useful feedback for the textbook authors for future

improvements.

In her 2008 pilot study, Gouverneur reports a complete

inconsistency in the use of collocations across different

textbooks, though a consistency in the use of particular tasks

across all levels. Her study also indicates the need for tasks

that will practice the cognitive skills of students such as

noticing, opening the way for potential improvements. One such

improvement could be to accompany the textbook with a CD-ROM

which will include a variety of teaching and learning

materials such as concordance lines, corpus-based or

data-driven activities and extra examples for further practice

(Meunier & Gouverneur, 2009).

The TeMa corpus may also help teachers to evaluate the

textbook design. An important issue is the type of

metalanguage used in textbooks to describe or categorize

vocabulary items and aspects of phraseology (Meunier &

Gouverneur, 2009). “A pilot study in 2007 has shown that the

metalanguage used in textbooks, and especially in the

guidelines to exercises is far too general and indirect”

(Meunier & Gouverneur, 2009, p. 196). As a result, the use of

more specific terms such as fixed idioms, collocations etc. to

refer to lexical units in a textbook would better aid the

students’ conceptual organization and understanding (Meunier &

Gouverneur, 2009).


CHAPTER THREE: Methodological framework

Materials

The aim of the current study is to determine the degree

of authenticity of the language used in the Greek State

school EFL textbooks, with specific attention to modal verbs.

In particular, the research question to be investigated is how

closely the use of five modals in the Greek State school

EFL textbooks corresponds to real-world use, as reflected in the

spoken part of the BNC.

To answer the research question stated above, we conducted a

corpus-based analysis of the modals under investigation in

terms of frequency and occurrence of different meanings in

four school textbooks. The textbooks are: English 5th grade: Pupil’s

and activity book by E. Kolovou & A. Kraniotou (level A1/A1+),

English 6th grade: Pupil’s and activity book by E. Efremidou, E. Zoe-Repa

& F. Fruzaki (level A2-/A2), Think teen, 1st grade of junior high school (for

beginners): Student’s and activity book by E. Karagianni, V. Koui, & A.

Nikolaki and Think teen, 1st grade of junior high school (for advanced learners):

Student’s and activity book by E. Karagianni, V. Koui, & A. Nikolaki

(level B1-/B1).


Both the frequency analysis and the meaning analysis were

carried out on the electronic versions of the four school

textbooks (our pedagogic corpora) with the help of the specialized

software MonoConc 2.2. MonoConc 2.2 is a fast text-searching

program with an excellent user-interface that includes many

features, such as the ability to create word lists (in both

alphabetical order and frequency order), generate concordance

output and give collocation information.

Procedure

At this stage we made use of the concordancer. Each one of the

pedagogic corpora was loaded into the program where we chose

the option of text search. Once we introduced the target word

(also called node word or keyword) in the text search, in this

case the modal verb under examination, we were presented with

all the occurrences of the word in the whole textbook.
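The text search described above can be approximated in a few lines of code. What follows is a minimal sketch in Python of a keyword-in-context (KWIC) search, not a reproduction of MonoConc. The function name `kwic`, the `width` parameter and the sample sentence are illustrative assumptions and do not come from the study.

```python
import re


def kwic(text, node, width=30):
    """Return one keyword-in-context line per whole-word match of `node`.

    `width` is the number of characters of co-text kept on each side
    (an arbitrary choice made for this sketch).
    """
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", flags=re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left co-text so that the node words line up.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Invented sample text standing in for a textbook file.
sample = "You can swim here. Can I help you? You must not run, but you may walk."
hits = kwic(sample, "can")
for line in hits:
    print(line)
```

Each printed line still has to be read by the analyst, since the search finds forms and does not assign meanings.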

Although the search was conducted automatically, the

interpretation of the findings called for manual work, as

every single use of the modal had to be examined thoroughly.

The reason for that is that, in order to identify the meaning

use of the modal in each occurrence, we need to consider the

full context or co-text in which it is found. The co-text is


provided in the upper window of the concordance listing when

clicking on the particular sentence. The close examination of

each modal occurrence enabled us to exclude from the search

all the instances in which the modal was not used within a

sentence or a meaningful context (i.e. in a grammatical

presentation). After finding out the number of the occurrences

corresponding to each meaning use we ranked each use in order

of frequency and drew conclusions about which meaning uses are

prioritized across the four textbooks.

To carry out the frequency analysis, we counted all the

matches, that is, the occurrences of each modal in the four

textbooks. The occurrences in which the modals were not used

in meaningful contexts were left out.
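
The counting and ranking steps can be sketched as follows, assuming each concordance hit has already been labelled by hand. The sample sentences, the label names and the "no_context" tag are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

# Hypothetical hand-labelled hits for one modal; "no_context" marks
# occurrences in grammar presentations, which are left out of the count.
labelled_hits = [
    ("I can swim.", "ability"),
    ("Can I open the window?", "permission"),
    ("can / can't (grammar box)", "no_context"),
    ("She can play the piano.", "ability"),
    ("It can get cold here.", "possibility"),
]

def rank_meanings(hits, excluded=("no_context",)):
    """Count meaning labels, drop excluded ones, rank by frequency."""
    counts = Counter(label for _, label in hits if label not in excluded)
    total = sum(counts.values())
    if total == 0:
        return []
    # Each entry: (meaning, number of occurrences, share in percent)
    return [(label, n, round(100 * n / total, 1))
            for label, n in counts.most_common()]

for label, n, pct in rank_meanings(labelled_hits):
    print(f"{label:<12} {n:>3} {pct:>5}%")
```

The total left after exclusion is also the figure that enters the frequency analysis for that modal.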

The results from both analyses were subsequently used (a)

for comparison among the four textbooks and (b) for comparison

between the pedagogic corpus and the BNC corpus in order to

draw conclusions regarding the use of particular modal verbs

by native speakers and their selection made by coursebook

writers for EFL teaching purposes.

For reasons of practicality and economy of effort and

time, the research was based on the British National Corpus

(BNC) data provided by Ute Römer (2004). Römer limited her

research to spoken material, as modals have been shown to occur

much more frequently in conversational English. In this

respect, she argued for the introduction of instances of

functional communicative use of language in the EFL textbooks,

traditionally oriented towards written discourse. The same concern

led us to focus on spoken discourse. As a result, the

study built on the frequencies of five modals (can, should, may,

might, must) in the 10-million-word spoken part of the BNC

corpus. The different meanings of the modals in the BNC

corpus, identified by Römer, were also used as such in the

current study.

Evidently, the criteria for the selection of the

particular methodology were the availability of the electronic

corpora of the four school textbooks and the MonoConc software

which facilitated the whole process of the analysis and

allowed manipulation of the textbook corpora in various ways.

A manual analysis would be far too challenging. Among the

advantages of the particular methodology, we can list the ease

and speed of processing and the potential of expanding the co-

text in a concordance line to obtain a better sense of the

meaning use.

CHAPTER FOUR: Results

This chapter aims to present the outcomes of the

frequency and meaning distribution analyses carried out for

each of the textbooks (5th grade up to 1st grade of junior high

school). The findings will allow us to provide answers to the

research questions set in the first chapter.

Initially, we will focus on how the particular modals are

presented and described in each of the textbooks. In all four

textbooks the introduction of the modals follows a consistent

pattern. There is a mention of their meanings in a separate

grammar corner within the context of the unit and a more

detailed description of their usage in the grammar section

towards the end of the book.

In the fifth grade pupil’s book (p. 68) students are

acquainted with three modal verbs in the following order: can,

must and should. May and might are not taught directly but are

found with considerable frequency in the textbook.

Can is used to talk about abilities, must for

obligation and should for giving advice. In the grammar appendix

the three modals are listed in a similar way. There we also

find a meaning distinction as it is clarified that must is

stronger than should.

To get a broader view of how these particular modals are

treated at this proficiency level (A1+/A2- level) and identify

any aspects common to this level, we need to examine the

meaning uses covered and the terms used to describe them in

other EFL textbooks of the same level. In addition, we need to

consider the order in which the modals are presented in the

textbooks, as it may reveal something about their status or

significance.

We may look for instance at two textbook series

circulating in the Greek market, viz. The Wonderkids (Longman)

and Upstream 1 (Express Publishing). In the former, the only

modals introduced are can and must. Can is presented first and is

found with the meanings of ability and permission. Must is

used to express obligation. In the grammar section the meaning

uses of the two modals are explained in the mother tongue

probably to ensure comprehension. Additionally, ways to

respond to questions for permission, using can, are suggested.

In Upstream 1, on the other hand, apart from can and must, we

also find should. The meanings given for must and should are

similar to those in the 5th grade textbook, though it is notable that

can is addressed only with the permission meaning.

A comparison among the three textbooks indicates an

agreement in the selection of the meaning uses of must and

should (i.e. obligation, advice, respectively). The absence of

an explicit mention of may and might is common to all three

textbooks. This may well suggest that can and should, which are

found with higher frequency than might and may in the BNC, are

considered a good starting point. It is also worth mentioning

that the modals are introduced in a similar order (can, must,

should) in all of the textbooks.

At this point we also need to consider the frequency

distribution of the modals in the 5th grade textbook and

compare it to the one represented in the BNC corpus, to determine

the prominence of each modal in grammar teaching.

Figure 1. Relative frequency of modals in the 5th grade textbook and the BNC corpus

The figure above demonstrates the relative frequency of

modals in the BNC corpus and the 5th grade textbook and allows

us to draw some interesting conclusions regarding the

treatment of each modal in the context of this EFL textbook. A

first glance would reveal that nearly all of the modals (can,

must, should, may) in the EFL textbook are used with much higher

frequency than in the BNC corpus. Particularly notable is the case of can,

which has an extremely high frequency rate, accounting for

over 50% of the overall modal occurrences.

Such overuse implies an uneven distribution of modals

across the textbook. What is more, apart from can, which

occupies the first position in both the textbook and the BNC, the

frequency order of the modals in the textbook seems to bear no

resemblance to that of the BNC corpus (can, should, might, must, may)

illustrated in the figure. A case in point is might, a

modal evidently frequent in spoken language that occurs with a

markedly low frequency in the EFL textbook.

The results of the comparison between the BNC and the

textbook regarding the meaning distribution of modals are

collected in Tables 1 to 5 below. The percentage under

each meaning category represents the share of that

meaning in the overall occurrences of the modal in

the textbook and in the BNC corpus, respectively.

Table 1. Different meaning distribution of can in the 5th grade textbook and the BNC

can                  ability   possibility   permission
5th grade textbook   81.4%     4.1%          14.4%
BNC                  36%       31.5%         23.5%

Table 2. Different meaning distribution of may in the 5th grade textbook and the BNC

may                  possibility   permission
5th grade textbook   88.8%         11.1%
BNC                  83%           13%

Table 3. Different meaning distribution of must in the 5th grade textbook and the BNC

must                 obligation   inference/deduction
5th grade textbook   86%          14%
BNC                  52%          39%

Table 4. Different meaning distribution of should in the 5th grade textbook and the BNC

should               advice   hypothetical
5th grade textbook   100%     -
BNC                  62.5%    30%

Table 5. Different meaning distribution of might in the 5th grade textbook and the BNC

might                possibility   permission
5th grade textbook   100%          -
BNC                  95%           3.5%

First of all, we observe that the hypothetical meaning of

should accounting for a considerable number of occurrences in

the BNC corpus (30%) is not addressed at all in the textbook.

The same goes for might where we have no mention, even to a

lesser degree, of the permission meaning. Looking at the

meaning distribution of may we may argue that both of its

meaning uses, highly frequent in speech, are covered

sufficiently in the textbook. In the case of must however, the

obligation meaning is emphasized over that of

inference/deduction. Can is used primarily with the ability meaning while

the other two meanings are covered inadequately.
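
The size of these mismatches can be expressed as percentage-point differences. The following minimal sketch uses the figures for can reported in Table 1; the function name gaps and the over/under labels are illustrative choices of ours and were not part of the original analysis.

```python
# Percentages for "can" as reported in Table 1 (5th grade textbook vs. BNC).
textbook = {"ability": 81.4, "possibility": 4.1, "permission": 14.4}
bnc = {"ability": 36.0, "possibility": 31.5, "permission": 23.5}

def gaps(textbook_pct, reference_pct):
    """Percentage-point difference (textbook minus reference) per meaning."""
    return {meaning: round(textbook_pct[meaning] - reference_pct[meaning], 1)
            for meaning in reference_pct}

for meaning, diff in gaps(textbook, bnc).items():
    status = "over" if diff > 0 else "under"
    print(f"{meaning:<12} {diff:+6.1f} points ({status}-represented)")
```

On these figures the ability meaning lies 45.4 points above its BNC share, while possibility and permission fall 27.4 and 9.1 points below theirs. The BNC percentages do not sum to 100, so the comparison covers only the three meanings listed.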

With respect to the 6th grade pupil’s book (level A2-/A2),

we observe that the modal verbs presented are can, may and

should. Might and must are present even though not commented on

directly. In this textbook, modals are taught in a rather

inductive way since students, instead of being presented with

explicit information right from the beginning, are invited to

guess or infer from the context the meaning expressed by each

modal. Only in the grammar section, where all grammatical items

introduced in the textbook are described in detail, do we find

information on their meaning uses. The meanings identified

are: ability, possibility and permission for can, possibility

and permission for may and advice for should. There is also some

information on the different usage of may and can when asking

for permission (can is preferred in informal speech).

Grammar Wheels 3 (level A2) published by Hillside Press

offers a lengthy account of the various meanings of modals in

the target language. Can is used for expressing ability and

for asking or giving permission. Might and may are both

addressed with the meaning of possibility, the latter with the

permission meaning as well. As for must, it is used in the

textbook either to denote personal obligation or order/command

while it can also be used for giving advice. There is also

reference to the deduction/inference meaning. In Top Score 3

(Oxford) on the other hand, we find only may, might and must with

the aforementioned meanings.

At this level we come across more detailed or thorough

descriptions of the five modals; in particular, more than one

meaning use of each modal verb is introduced and practiced.

However, even though we find a similar order of introduction

concerning the modals present in all of the textbooks and a

consistency in the terms used to refer to their meanings, the

three textbooks do not have much in common. Grammar Wheels

addresses all modals under investigation with no exception

whereas Top Score 3 and the 6th grade textbook address just three of

them.

Figure 2. Relative frequency of modals in the 6th grade textbook and the BNC corpus

As far as the frequency of the modals in the textbook is

concerned, the figure above indicates an excessive use of can

over the rest of the modals when compared with their frequency

distribution in the BNC corpus. The order of frequency is

also completely different. May is found in the second place

with should, must and might following whereas in the BNC the

modals are sequenced as can, should, might, must and may. In

addition, most of the modals occur with much higher frequency

than in the BNC. Might and should are the only modals

underrepresented. Most striking perhaps is the case of might in

that its frequency rate in the textbook is substantially lower

than in the BNC.
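
One way to put a number on how far two frequency orders diverge is a rank correlation. The sketch below applies Spearman's coefficient to the two orders stated in this paragraph; this measure was not part of the original analysis and is offered only as an illustration, and with five items it carries little statistical weight.

```python
def spearman_rho(order_a, order_b):
    """Spearman rank correlation for two orderings of the same items (no ties)."""
    n = len(order_a)
    rank_b = {item: position for position, item in enumerate(order_b)}
    d_squared = sum((position - rank_b[item]) ** 2
                    for position, item in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Frequency orders as stated in the text above.
bnc_order = ["can", "should", "might", "must", "may"]
textbook_order = ["can", "may", "should", "must", "might"]

rho = spearman_rho(bnc_order, textbook_order)
print(round(rho, 2))
```

A value of 1 would mean identical orders. The orders above give 0.3, which is consistent with the limited correspondence described in the text.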

Table 6. Different meaning distribution of can in the 6th grade textbook and the BNC

can                  ability   possibility   permission
6th grade textbook   87.1%     10.9%         1.8%
BNC                  36%       31.5%         23.5%

Table 7. Different meaning distribution of may in the 6th grade textbook and the BNC

may                  possibility   permission
6th grade textbook   91.7%         8.2%
BNC                  83%           13%

Table 8. Different meaning distribution of must in the 6th grade textbook and the BNC

must                 obligation   inference/deduction
6th grade textbook   77.7%        22.2%
BNC                  52%          39%

Table 9. Different meaning distribution of should in the 6th grade textbook and the BNC

should               advice   hypothetical
6th grade textbook   100%     -
BNC                  62.5%    30%

Table 10. Different meaning distribution of might in the 6th grade textbook and the BNC

might                possibility   permission
6th grade textbook   100%          -
BNC                  95%           3.5%

As for the different meaning uses, in the textbook, can is

primarily used to express ability and less frequently

possibility. The frequency gap between the two meaning uses is

much wider than in the BNC corpus (31.5% possibility, 36%

ability). Additionally, the permission meaning, used in 23.5% of

the cases in the BNC corpus, is hardly acknowledged in the

textbook. The frequency rates for the meaning uses of may and

must seem to be more consistent with the BNC evidence. Yet,

this does not apply to should and might where once more we find

no instances of meaning uses other than the most

frequent ones.

None of the modals under investigation are presented in

the 1st grade of junior high school textbook (for beginners)

except for should. This absence cannot be understood or

justified if there is supposed to be a continuity in the

grammar instruction of primary education that includes the

recycling of past grammar items along with the introduction of

new ones. Nevertheless and despite the fact that no modal is

explicitly taught or practised, the frequency count (Figure 3

below) reveals a recurrent use of the particular modals

throughout the textbook.

Figure 3. Relative frequency of modals in the 1st grade of junior high school textbook (for beginners) and the BNC corpus

As we can see in the figure, there is a huge frequency

gap between can, on the one hand, and the other four modals, on

the other. It is also evident that the modals do not occur

with the frequency order suggested by the BNC (can, should, might,

must, may). For instance, might is the third most frequent one in the

BNC (3.93%) whereas in the textbook it comes in the last

place. Similarly, the frequency difference of might and must is

very small in the BNC but much greater in the textbook.

Additionally, it seems that most of the modals in the textbook

are used more regularly than in the BNC corpus. The only

exception is might where we notice a significant underuse.

Furthermore, the different meanings analysis (Tables 11 to 15)

demonstrates that only can occurs with a variety of meanings

(ability, possibility, permission). The rest of the modals are

used exclusively with their most frequent meaning use

according to the BNC corpus. This suggests that equally

important meaning uses are overlooked.

Table 11. Different meaning distribution of can in the 1st grade of junior high school textbook (for beginners) and the BNC

can                                 ability   possibility   permission
Junior high 1st grade (beginners)   78.6%     12.5%         8.8%
BNC                                 36%       31.5%         23.5%

Table 12. Different meaning distribution of may in the 1st grade of junior high school textbook (for beginners) and the BNC

may                                 possibility   permission
Junior high 1st grade (beginners)   100%          -
BNC                                 83%           13%

Table 13. Different meaning distribution of must in the 1st grade of junior high school textbook (for beginners) and the BNC

must                                obligation   inference/deduction
Junior high 1st grade (beginners)   100%         -
BNC                                 52%          39%

Table 14. Different meaning distribution of should in the 1st grade of junior high school textbook (for beginners) and the BNC

should                              advice   hypothetical
Junior high 1st grade (beginners)   100%     -
BNC                                 62.5%    30%

Table 15. Different meaning distribution of might in the 1st grade of junior high school textbook (for beginners) and the BNC

might                               possibility   permission
Junior high 1st grade (beginners)   100%          -
BNC                                 95%           3.5%

Finally, the 1st grade of junior high school textbook for

advanced learners presents the following modal verbs: must, may,

might and should. According to the textbook, must signifies

something really important to happen while it is also used for

making guesses. Similarly, might/may are used to indicate

that we are not sure whether something is going to happen or

not. No particular information is provided about the meaning

of should although it is introduced implicitly under the

general category of ways of giving advice. In the grammar

section the same information is provided about the particular

modals. Must and may/might are termed as modals of certainty

and uncertainty, respectively. Should is presented as another

way, among others, of giving advice (the others being why

don't you ….., a good idea is to ….., etc.).

To see how these modals are treated in other textbooks

of A2+/B1- level we will have a look at Heroes 3 published by

Oxford Press and Cosmic published by Pearson Longman.

In Heroes 3, must is presented as a modal verb used for

obligation and rules and may/might as verbs that allow us to

say that something is possible in the future. Should is used for

giving advice or for indicating the right way to do something.

In Cosmic must is found to express certainty (deduction)

and obligation while it can also be used for giving advice.

Can has the meanings of ability and permission or request,

should that of giving advice and may that of possibility and

permission. Might is not presented at all.

Perhaps the common characteristic in all three EFL

textbooks of this level is the fact that they address, to a

certain extent, all five modals under investigation. In all of

them we find quite similar references regarding should, must and

may/might meanings. However, none of the EFL textbooks accounts

for the permission meaning of might.

To figure out the frequency of the particular modals in

the textbook and compare it with the BNC, we have to consult

Figure 4 below.

Figure 4. Relative frequency of modals in the 1st grade of junior high school textbook (for advanced learners) and the BNC

The frequency analysis of the particular modal verbs in

this textbook reveals an overemphasis on can in relation to the

other modals. More importantly, we find a discrepancy in the

frequency rates of the textbook compared to those of the BNC

corpus. Once again the frequency order of the modals does not

correspond to the one suggested in the BNC corpus (can, should,

might, must, may).

Regarding the meaning distribution, we see that even

though all meanings of can occur in the textbook context, the

possibility meaning is underrepresented as it only covers a

minor percentage (7.1%). May, might and should are addressed only

with their primary meaning use, while must seems to be more

often used to express obligation.

Table 16. Different meaning distribution of can in the 1st grade of junior high school textbook (for advanced learners) and the BNC

can                                ability   possibility   permission
Junior high 1st grade (advanced)   83.1%     7.1%          9.7%
BNC                                36%       31.5%         23.5%

Table 17. Different meaning distribution of may in the 1st grade of junior high school textbook (for advanced learners) and the BNC

may                                possibility   permission
Junior high 1st grade (advanced)   100%          -
BNC                                83%           13%

Table 18. Different meaning distribution of must in the 1st grade of junior high school textbook (for advanced learners) and the BNC

must                               obligation   inference/deduction
Junior high 1st grade (advanced)   88%          12%
BNC                                52%          39%

Table 19. Different meaning distribution of should in the 1st grade of junior high school textbook (for advanced learners) and the BNC

should                             advice   hypothetical
Junior high 1st grade (advanced)   100%     -
BNC                                62.5%    30%

Table 20. Different meaning distribution of might in the 1st grade of junior high school textbook (for advanced learners) and the BNC

might                              possibility   permission
Junior high 1st grade (advanced)   100%          -
BNC                                95%           3.5%

Along the same lines, it will be interesting to carry

out an intertextbook comparison, that is, to examine the

frequency rates of the five modals in each of the four

textbooks and compare the findings. A quick overview of the

four frequency figures reveals that the five modals are similarly

distributed across the different textbooks. For instance, can

occurs with the highest frequency in all of the textbooks. Must

is the second most frequent in three of the textbooks (5th

grade, 1st grade of high school-beginner and advanced). The 6th

grade textbook diverges from the rest as we have a higher

percentage of frequency for may (18.7%) which is however close

to that of must (18.3%). Might, on the other hand, accumulates

the lowest percentage of use in all of the textbooks.

In terms of meaning distribution, all of the textbooks

seem to give precedence to the ability meaning of can, whereas

might and should are used exclusively with the meanings of

possibility and advice, respectively. May is used to express

permission in only two of the textbooks as in the rest it only

appears with the possibility meaning. Finally, must in its

overall occurrences is most regularly used to denote

obligation.

CHAPTER FIVE: Discussion

Apart from the textbook by textbook analysis, to study the

findings more comprehensively we proceeded to compile

a single textbook corpus comprising the data of the four

textbooks. Figure 5 below shows the relative frequency

of the modals in the BNC and the textbook corpus. The

comparison of the textbook corpus data with that of the spoken

BNC corpus will enable us to determine how closely the

use of modals, as presented in the textbooks, corresponds to their actual

use in speech.

Figure 5. Relative frequency of modals in the BNC and the textbook corpus

Evidently, the frequency distribution of the modals in

the textbook corpus differs quite a lot from the one found in

the spoken corpus of the BNC. As we see, can (70.4%) accounts for

the majority of the occurrences of modals found in the

textbook corpus. This percentage seems disproportionate

compared to that of the BNC corpus, thus indicating an

overuse of the particular modal. Similarly, may, must and

should are used with a higher frequency in the textbook corpus,

whereas might is significantly

underused. Looking at the frequency percentages, we might also

suggest that the frequency order of the modals in all four

textbooks widely diverges from the one reflected in the BNC

corpus. In particular, must (11.1%) is prioritized over should

(8.15%), which actually seems to be more frequent in speech. It

is also worth mentioning that might (3.93%), which is in the

third position in the BNC scale of frequency, is used the

least in the textbook corpus (2.06%).

At this point, we will have to look for potential

explanations justifying these differences between real life

and the school textbook uses. The fact that certain modals are

introduced as new items and are used extensively in exercises

or reading texts for practice may possibly account for their

overuse over the rest. This is particularly evident in the 5th

and 6th grade textbooks, where the three modals explicitly

taught (can, must, should for the 5th grade and can, may, should for the 6th

grade) occur with much higher frequency.

Furthermore, the complete disregard for the permission

meaning of might and the higher frequency percentages of can

over may with the same meaning use in all of the textbooks may

indicate a preference for more informal ways of asking or

giving permission. What is more, the syllabi of the textbooks

of primary school grades seem to focus a lot on the students’

competence in talking about personal abilities and skills.

Presumably this focus explains the emphasis on the ability

meaning of can at the expense of others. Last but not least, we

could justify the absence of the hypothetical meaning of should

on the assumption that the whole issue of making hypotheses

is to be addressed when students are presented with

conditional structures at a more advanced proficiency level.

The findings of the comparison concerning the five modals

under investigation can also be compared with the findings on

the Green Line corpus, constructed and used by Ute Römer in her

study on modals, expounded in Chapter 2. The main purpose of

her study was to examine if the use of modals in the textbooks

resembles the actual language use. Römer analyzed the

occurrence of 10 modals (can, could, may, might, will, would, shall,

should, ought to, must) in the 10-million-word spoken subcorpus of the

British National Corpus (BNC) to find out their frequency and

their different meaning distribution along with the syntactic

context in which they occur. She followed the same procedure

with the German EFL textbook, Learning English Green Line (Vols 1-6)

and the grammar book, Learning English Grundgrammatik.

The frequency analysis of the Green Line corpus revealed

that can has the highest frequency (101 occurrences). Will/’ll

followed with 95 occurrences, while the frequency of the other

modals (must, would, could, may, should, shall, ought to, might) varied

between 5 and 29 occurrences. The least frequent modal was

might (3 occurrences).

In terms of meaning distribution, can was found to express

ability in 52.5% of its occurrences. Less frequent meanings

were those of possibility (24.7%) and permission (22.8%). Might

was exclusively used with the meaning of possibility. May

expressed possibility in 58.3% of the cases and permission in

41.7%. Must was used for obligation in nearly all of its

occurrences (93.1%). A small percentage was covered by the

meaning of inference/deduction (6.9%). Should was mainly used for

advice (80%) and less frequently to express hypothetical

situations (20%).

The comparison of the results with the BNC data proved

that there are several mismatches between the use of modals in

the EFL textbook context and real-world use. In terms of

frequency, can and must tend to be overused compared to should

and might. The order in which the modals are distributed in the

Green Line corpus also proved to have little connection with

the BNC one.

Furthermore, for can expressing ability the percentages in

the Green Line corpus exceed those of the BNC, whereas the

possibility and permission meanings seem to be more evenly

distributed. Concerning may, the frequency gap between the

possibility and the permission meaning in the textbook corpus

is much narrower than in the BNC. Striking is also the fact that,

even though must expresses inference/deduction in 39% of its

occurrences in the BNC, the textbook corpus addresses this

meaning only in 6.9% of the occurrences. Might is used

exclusively to express possibility.

Comparing the findings of the Green Line corpus with our

textbook corpus, we observe that both corpora fail to reflect

the frequency order of modals provided in the spoken part of

the BNC (can, should, might, must, may). Noticeable is the fact that

in both textbook corpora might is used the least. Other common

characteristics are: the overuse of can with an emphasis on the

ability meaning over the rest of the modals, the underuse of

the inference/deduction meaning of must, and the disregard for

the permission meaning of might. The only difference we find is

in the use of should in that the Green Line corpus accounts for

both meanings (hypothesis and advice) quite sufficiently.

The demonstrated lack of representation of corpus data in the

EFL textbooks used in primary and secondary education in

Greece strengthens the need for a reassessment of the existing

teaching materials and the subsequent production of materials

closer to the authentic language use. Braun (2005) suggests

that only corpus-based descriptions of language are capable of

providing realistic and up-to-date data which can be used as a

resource for the creation of interesting corpus materials.

Indeed, corpus research has had a significant influence on

syllabus design with the development of corpus informed

materials such as the Touchstone series (McCarthy, McCarten &

Sandiford, 2005), an innovative series for adult and young

adult learners of English that draws on extensive research

into the Cambridge International Corpus of North American

English (CIC). We also need to mention the Collins Birmingham

University International Language Database (COBUILD) project

that has produced a series of teaching materials based on

concordance data from a large corpus of the English language.

Therefore, the four EFL textbooks used in Greek primary

and secondary education need to be redesigned on the basis of

empirical data that offer information on the most frequently

occurring items and the choices native speakers tend to make

in speech. Of course, “frequency data alone cannot dictate

pedagogy” (Conrad, 2000, p. 550). The finding that a

particular meaning use of a modal is less frequent than

another does not mean that all teachers should necessarily

neglect it. A grammatical feature which does not appear

regularly in everyday language may have an important function

in specific kinds of discourse indeed. Worth mentioning is the

use of might and may to give or ask for permission,

particularly in formal registers. As a result, it is important

that a textbook account for these meanings as well. Along the

same lines, greater emphasis should be placed on the

permission meaning of can, in that it is widely used in informal

everyday situations and thus important for functional language

use outside the EFL classroom. Moreover, the textbooks should

explicitly state the semantic distinction in the use of can and

may for permission by referring to the degree of formality.

The possibility meaning of can needs to be stressed as well.

Finally, the hypothetical meaning of should needs to be

addressed, as it is common in speech according to the frequency

rates of the BNC. It would therefore be an improvement for the

textbooks to present the different meanings of each modal in

proportions similar to those found in the BNC.
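
The comparison of proportions suggested above can be sketched in a few lines of Python. This is only an illustration, not the procedure followed in the study: the function names are hypothetical, and the counts are invented placeholders rather than figures from the textbooks or the BNC. For each meaning of a modal, the sketch reports the textbook share minus the reference share, so a negative value marks a meaning that is underrepresented in the textbook.

```python
def proportions(counts):
    """Convert raw counts per meaning into shares that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {meaning: 0.0 for meaning in counts}
    return {meaning: n / total for meaning, n in counts.items()}


def proportion_gaps(textbook_counts, reference_counts):
    """Textbook share minus reference share for every meaning."""
    tb = proportions(textbook_counts)
    ref = proportions(reference_counts)
    meanings = sorted(set(tb) | set(ref))
    return {m: tb.get(m, 0.0) - ref.get(m, 0.0) for m in meanings}


# Invented placeholder counts for the meanings of "can"; NOT the study's data.
textbook_can = {"ability": 60, "permission": 10, "possibility": 30}
reference_can = {"ability": 40, "permission": 20, "possibility": 40}

for meaning, gap in proportion_gaps(textbook_can, reference_can).items():
    print(f"{meaning}: {gap:+.2f}")
```

Working with shares rather than raw counts makes a small textbook corpus comparable with a much larger reference corpus.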

Speaking of frequency, it would be advisable that the

modals be introduced in the order in which they are commonly found in

speech, that is, can, should, might, must, may. Even though the most

frequent modals such as can and should need to be prioritized

over less frequent ones, the presence of less frequent modals

in the textbook is necessary, since implicit exposure would be

beneficial for students.
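
A frequency-based ordering of this kind could be derived as in the following minimal sketch. It assumes a plain-text sample and a naive whole-word count; the function names and the sample sentence are hypothetical and serve only as illustration. A real analysis would need to disambiguate homographs (the noun can, the month May) and decide how to treat contracted forms such as can't, neither of which this sketch attempts.

```python
import re
from collections import Counter

# The five modals examined in the study.
MODALS = ("can", "must", "should", "may", "might")


def modal_frequencies(text, modals=MODALS):
    """Count each modal as a whole word, ignoring case.

    Naive: "can't" is tokenized as one word and so is not counted as "can",
    and homographs such as the month "May" are counted as modals.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t in modals)
    return {m: counts.get(m, 0) for m in modals}


def rank_by_frequency(freqs):
    """Return the modals ordered from most to least frequent."""
    return sorted(freqs, key=lambda m: freqs[m], reverse=True)


# Made-up sample text, used only to show the calls.
sample = "You can go now. Can I help? You should rest. It might rain. You can stay."
freqs = modal_frequencies(sample)
print(rank_by_frequency(freqs))
```

Applied to the spoken subcorpus and to the textbook corpus separately, the two rankings could then be compared directly.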

Finally, regarding the presentation of the modals in the

textbooks, we may argue for a move towards a meaning-based

categorization. An example of this kind of categorization can be

found in the 6th grade textbook, where should is presented among

other ways of giving advice. Similarly, the

inference/deduction meaning of must could be introduced and

contrasted along with the negative inference/deduction meaning

of can’t. Likewise, can and may/might could be listed as informal

and formal ways, respectively, to give or ask for permission. In

addition, provision should be made for recycling previously

taught modals as the students advance to a higher grade. This

should be taken seriously into consideration, particularly in

the syllabus of the textbook for the 1st grade of junior high school.

CHAPTER SIX: Conclusion

The primary aim of this study was to conduct a corpus-based

comparative analysis regarding the frequency and meaning

distribution of five basic modals (can, must, should, may, might)

used in the EFL textbooks of Greek state primary and secondary

schools (5th grade to 1st grade of junior high school).

The analysis showed that the presentation and treatment

of the particular modals in the context of the textbooks

differed considerably from the use of those verbs as attested

in the spoken subcorpus of the BNC. Major findings in

terms of frequency were an overuse of certain modals at the

expense of others and an incongruity with the BNC data

in the order of presentation of the modals. As far as the

modal meaning uses are concerned, certain meaning uses were

emphasized over equally important ones while there were cases

where highly frequent meanings were underrepresented or even

left out.

First and foremost, primary/secondary school teachers and

textbook designers are expected to take into consideration the

gap between the content of the teaching materials and the

naturally occurring language. Secondly, they need to closely

cooperate in making principled decisions concerning the

selection and the grading of modals and their meaning uses on

the basis of corpus evidence.

At this point, however, it should be made clear that the

results of the current study have shed light on only a very

limited part of the content of the state school EFL textbooks,

that is, the modal verbs. To draw reliable conclusions about

authenticity in the EFL textbooks, prospective textbook

designers and teachers will have to conduct more extensive

research that addresses other teaching items as well.

Therefore, a lot of corpus-driven work still has to be

done if the teaching content is to become more focused on

conditions of real world use. A significant part of that work

lies in the hands of the corpus linguists and researchers

dealing with the empirical study of language with computer-

assisted techniques. The new generation of EFL teachers who

have received training in corpus-research and are capable of

conducting their own small scale corpus investigations will

also play a key role in addressing similar issues. Finally, once

widely disseminated and exploited by the textbook designers,

these descriptive studies have the potential to revolutionize

the English language curriculum, enabling students to use a

more authentic kind of language. Consequently, if corpus

linguistics is to have an optimum impact on language learning

and lead to concrete pedagogical applications, the cooperation

between researchers and teaching professionals should be

reinforced.

References

Aston, G. (2000). Corpora and language teaching. In L.

Burnard & T. McEnery (Eds.), Rethinking language pedagogy from

a corpus perspective: Papers from the third international conference on

teaching and language corpora (Vol. 2, pp. 7-17). Frankfurt: Peter

Lang.

Bernardini, S. (2004). Corpora in the classroom: An overview

and some reflections on future developments. In J.

Sinclair (Ed.), How to use corpora in language teaching (pp. 15-36).

Amsterdam and Philadelphia, PA: John Benjamins.

Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics:

Investigating language structure and use. Cambridge: Cambridge

University Press.

Braun, S. (2005). From pedagogically relevant corpora to

authentic language learning contents. ReCALL, 17(1), 46-

47.

Chambers, A. (2010). What is data-driven learning? In A. O’Keeffe

& M. McCarthy (Eds.), The Routledge Handbook of Corpus

Linguistics (pp. 345-356). Routledge.

Conrad, S. (2000). Will corpus linguistics revolutionize

grammar teaching in the 21st century? TESOL Quarterly, 34(3),

548-560.

Cullen, R., & Kuo, I-C. (2007). Spoken grammar and ELT course

materials: A missing link? TESOL Quarterly, 41(2), 361-386.

Efremidou., Zoe-Repa, E., & Fruzaki, F. (2009a). English 6th

grade: Pupil’s book. Athens: Ministry of Education and

Religion.

Efremidou., Zoe-Repa, E., & Fruzaki, F. (2009b). English 6th

grade: Pupil’s workbook. Athens: Ministry of Education and

Religion.

Gilquin, G., & Granger, S. (2010). How can data-driven

learning be used in language teaching? In A. O’Keeffe &

M. McCarthy (Eds.), The Routledge Handbook of Corpus Linguistics

(pp. 359-370). Routledge.

Granger, S. (2004). Computer learner corpus research: Current

status and future prospects. In U. Connor & T. Upton

(Eds.), Applied corpus linguistics: A multidimensional perspective (pp.

123-145). Amsterdam: Rodopi.

Hunston, S. (2002). Corpora in applied linguistics. Cambridge:

Cambridge University Press.

Johns, T. (1991). From printout to handout: Grammar and

vocabulary teaching in the context of data-driven

learning. In T. Odlin (Ed.), Perspectives on pedagogical grammar

(pp. 293-313). Cambridge: Cambridge University Press.

Kaltenböck, G., & Mehlmauer-Larcher, B. (2005). Computer

corpora and the language classroom: on the potential and

limitations of computer corpora in language teaching.

ReCALL, 17(1), 65–84.

Karagianni, E., Koui, V., & Nikolaki, A. (2009a). Think teen, 1st

grade of junior high school (for beginners): Student’s book. Athens:

Ministry of Education and Religion.

Karagianni, E., Koui, V., & Nikolaki, A. (2009b). Think teen, 1st

grade of junior high school (for beginners): Student’s workbook. Athens:

Ministry of Education and Religion.

Karagianni, E., Koui, V., & Nikolaki, A. (2009a). Think teen, 1st

grade of junior high school (for advanced learners): Student’s book. Athens:

Ministry of Education and Religion.

Karagianni, E., Koui, V., & Nikolaki, A. (2009b). Think teen, 1st

grade of junior high school (for advanced learners): Student’s workbook.

Athens: Ministry of Education and Religion.

Kennedy, G. (1998). An introduction to corpus linguistics. Harlow:

Longman.

Kolovou, E., K., & Kraniotou, A. (2009a). English 5th grade: Pupil’s

book. Athens: Ministry of Education and Religion.

Kolovou, E., K., & Kraniotou, A. (2009b). English 5th grade: Activity

book. Athens: Ministry of Education and Religion.

McEnery, T., Xiao, R., & Tono, Y. (2006). Corpus-based language

studies: An advanced resource book. London: Routledge.

Meunier, F., & Gouverneur, C. (2009). New types of corpora for

new educational challenges. In K. Aijmer (Ed.), Corpora and

language teaching (pp. 179-198). Amsterdam: John Benjamins.

O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to

classroom: Language use and language teaching. Cambridge:

Cambridge University Press.

Osborne, J. (2002). Top-down and bottom-up approaches to

corpora in language teaching. In U. Connor & T. Upton

(Eds.), Applied corpus linguistics: A multidimensional perspective (pp.

251-265). Amsterdam: Rodopi.

Römer, U. (2004a). A corpus-driven approach to modal

auxiliaries and their didactics. In J. Sinclair (Ed.),

How to use corpora in language teaching (pp. 15-33). Amsterdam:

John Benjamins.

Römer, U. (2004b). Comparing real and ideal language learner

input: The use of an EFL textbook corpus in corpus

linguistics and language teaching. In G. Aston, S.

Bernardini, & D. Stewart (Eds.), Corpora and language learners

(pp. 151-160). Amsterdam: John Benjamins.

Schmitt, N. (2000). Vocabulary in language teaching. Cambridge:

Cambridge University Press.
