A corpus-based study of the frequency and meaning distribution
of modals
in the English language textbooks of Greek state primary and
secondary schools
Niovi Hatzinikolaou
Aristotle University of Thessaloniki
Department of English Language and Literature
A corpus-based study of the frequency and meaning distribution of five modals
Thessaloniki, 2012
ABSTRACT
The present study reflects the renewed interest of language
pedagogy in the issue of authenticity in the EFL classroom
context under the new perspective of pedagogic corpora. The
study involves a comparative analysis of the modal auxiliaries
in terms of frequency and distribution of meaning in the
spoken part of the BNC corpus and in a series of EFL
textbooks used in Greek State primary and secondary schools
(5th and 6th grade of primary school and 1st grade of junior
high school). The
primary aim is to examine the use of modals in the EFL
textbooks as opposed to their real world use represented in
the BNC corpus. The findings reveal a rather limited
correspondence and signify the need for incorporating in the
textbooks the most typical or frequent occurrences of modals
in natural communicative situations if we are to achieve a
more comprehensive representation of the native speaker’s use
of language. The results of the study will prove useful for
teachers and textbook designers for both research and
pedagogical purposes. Recommendations and suggestions for
further improvements in EFL textbooks, associated with a
broader shift towards corpus-based materials, are also
offered.
CHAPTER ONE: Introduction
Authenticity in the EFL classroom has recently been the
concern of many practitioners of corpus linguistics in
language pedagogy. The renewed interest in the content of
the English language curriculum is evident in the increasing
number of corpus-based studies that address a wide range of
topics in the area of grammar and vocabulary instruction. The
principal aim of those studies is to examine the relationship
between school language and real-life language use.
Examples include Ute Römer’s (2004) study on the treatment of modal
verbs and if-clauses in EFL textbooks, Cullen & Kuo’s (2007)
research on the coverage of conversational grammar and
Biber et al.’s (2004) study on academic discourse.
The reason that warrants the current study is a similar
need for Greek students to learn the kind of English that is
used and understood by native speakers. However, we have to
take into account that Greek state school textbooks are mainly
directed at written discourse, as in most cases they are
exam-oriented. This means that they do not emphasize the building
of communicative skills that employ a functional kind of
language. To argue for a move towards communicative textbooks,
we will first have to examine the language used in the current
ones. The leading question of our research is: how and to
what extent does the English language taught in Greek state
schools differ from authentic spoken English? In
particular, the study concentrates on one of the most
problematic areas of English grammar instruction for students
and teachers, the modal verbs. This particular
grammatical area was chosen on the basis of the
significance of modals in functional language and in
consideration of the early introduction of modal verbs in the
EFL textbooks used in Greek state primary schools. The
study addresses only five basic modals (can, should, may, might,
must).
The research is conducted on two primary school and two
junior high school EFL textbooks and involves the analysis of
the frequency of occurrence of modals and the distribution of
their different meanings. Since we are interested in the
accurate use of modals in communicative contexts, the results
of the analysis are compared with frequency and meaning
distribution data from the spoken subcorpus of the BNC, which
is highly representative of native speech.
The findings derived from our research are expected to be
of immense value to teachers and textbook designers in that
they may provide the incentive for the reassessment of the
Greek State school textbooks from a corpus-based perspective.
CHAPTER TWO: Literature review
Corpora can be described as “large collections of databases of
language that include different kinds of discourse” (Schmitt
2000, p.68). More accurately, a corpus is “a principled
collection of texts that represents something and can be used
for qualitative and quantitative analysis” (O’Keeffe, McCarthy
and Carter, 2007; Biber, Conrad and Reppen, 1998).
Some of the earliest corpora were developed at the
beginning of the 1900s. They were constructed manually following a
lengthy procedure of collection and typing of written excerpts
from specific genres of discourse. Considering the effort for
their production at that time “a one million word corpora was
on the large side” (Schmitt, 2000). The real revolution in
corpus linguistics came with the arrival of computers. The
great advances of technology that allowed the scanning and
digital manipulation of text opened the way for the
compilation of large corpora comprising millions of words.
Today “multimillion-word corpora such as the British National
Corpus (BNC, 100 million), the Bank of English (450 million)
or the International Corpus of English (ICE, 20 million) are
widely available and can be run on an ordinary PC or accessed
via the web” (Braun, 2005, p.48).
A corpus offers a new perspective in the description of
language and, as Hunston (2002) points out, it serves as “a
more reliable guide to language use than native speaker
intuition” (p.20) regarding judgments about collocation,
frequency, phraseology and pragmatic meaning. In several
instances where native speaker intuition may fail to justify a
particular linguistic choice, corpora can provide the right
answer or solution.
However valuable they might be, corpora are still subject to
limitations, as in numerous cases they cannot function
independently of native speaker intuition. Corpora do not give
information about whether a particular language feature is
possible or acceptable, in that they only provide us with
evidence (Hunston, 2002; Kaltenböck and Mehlmauer-Larcher,
2005). Essentially, they “present language out of its context”
(Hunston, 2002, p.23) and this way many important features
specific to each kind of discourse are lost or
underrepresented.
As mentioned before, corpus research has shed light on
two important aspects of vocabulary, frequency and
collocation. According to Schmitt (2000), knowledge of the
most frequent words in discourse has immediate pedagogical
implications as it facilitates decisions about what vocabulary
is vital to teach in an EFL environment. What is more,
whatever knowledge we have about the collocational behavior of
words comes from corpus observations, usually with the help of
specialized programs called concordancers (Schmitt, 2000;
O’Keeffe, McCarthy and Carter, 2007). Apart from language
description, applications derived from corpus investigations
are found in a number of different areas.
Corpora have revolutionized the area of
lexicography. Corpus data have been incorporated into
dictionaries and grammars to establish authenticity, offer
insights into frequency, collocation and phraseology and
provide information about variation in different registers
(Hunston, 2002). Today all widely used modern dictionaries,
such as the Longman Dictionary of Contemporary English and
the Oxford Advanced Learner’s Dictionary, are corpus-based
(Hunston, 2002).
Hunston (2002) suggests that “since corpora are used to
raise awareness about language in general, they are extremely
useful in training translators and in pointing up potential
problems for translation” (p.123).
Another notable application is found in stylistics,
where O’Keeffe, McCarthy and Carter (2007) note that “corpora
provide a novel methodology for analyzing literary texts like
books, poems etc through the study of collocations” (p.18).
Corpora have also been widely used in forensic
linguistics for comparisons between documents (confessions,
letters) to ascertain their genuineness, or to identify
authorship in academic settings when addressing issues of
plagiarism (O’Keeffe, McCarthy and Carter, 2007; McEnery,
Xiao & Tono, 2006).
Finally, corpora have contributed in part to the study
of pragmatics, as the ease and speed of processing
enable researchers to examine certain pragmatic features
such as hedging, politeness devices, discourse markers and irony
(McCarthy and O’Keeffe, 2007).
Notable in the area of language learning is the emergence
of data-driven learning (Johns, 1991), an inductive learning
approach that makes use of the tools of corpus linguistics to
establish authenticity in the classroom (Gilquin & Granger,
2010). Based on the assumption that “effective learning is
itself a form of linguistic research” (Johns, 1991, p. 297),
data-driven learning challenges the conventional roles of
teachers and learners. The teacher has to abandon the role of
the expert, taking on that of “research organizer” (Johns,
1991, p. 297; Bernardini, 2004). The learner is encouraged to
discover language independently through a process of observation
and interpretation of patterns of use found in corpus data,
which is called discovery learning (Bernardini, 2004; Aston,
2009). This direct learning process cuts down the mediating
role of reference materials and gives learners the
autonomy to consult corpora and use concordancing software to
find an answer to a problem or draw their own conclusions
about an issue (Chambers, 2010).
“Corpus based descriptions of language have also had an
immediate effect on language teaching methodology and the
content of instruction” (Kennedy, 1998, p.281). Findings about
the frequency of words, their most central uses and their
collocational contexts affect the selection, the sequencing
and the emphasis placed upon the teaching items and help
teachers to set priorities (Kennedy, 1998). Additionally, we
see that the “recurrent fixed sequences identified by corpus
analysis in the English language are now taken seriously in
language pedagogy” (p.289) and their role has become central
for successful learning (Kennedy, 1998).
Since corpora are designed for a specific purpose, we
find a variety of different types of corpora. The first major
distinction to be made is between specialized and general
corpora.
A general corpus includes a wide variety of texts of many
different genres. It is typically used as a reference tool
with applications across different disciplines. The best-known
general corpora, comprising millions of words, are the
British National Corpus (BNC), also used in this research, and
the Bank of English (Hunston, 2002).
A specialized corpus, on the other hand, is representative
of a particular genre, such as newspaper editorials, academic
articles or learner-produced materials; specialized corpora
allow the investigation of “a particular type of language”
(Hunston, 2002, p. 14).
Another distinction would be between comparable and
parallel corpora. The term comparable refers to two or more
corpora in different languages or language varieties that are
used to trace similarities and differences between them
(Bernardini, 2004). The ICE corpora (International Corpus of
English) are widely used comparable corpora, each one
representing a different variety.
Parallel corpora are limited to texts that have been
translated from one language to another or produced in a
variety of languages. Usually these comprise European Union
documents and are indispensable tools in the work of
translators (Hunston, 2002).
A monitor corpus is a large representative corpus that
keeps increasing in size, as it is constantly updated with new
material. Monitor corpora like the COBUILD corpus are typically
used to “track the current changes in a language” (Hunston,
2002, p. 16).
McEnery, Xiao & Tono (2006) also use the term
‘historical’ or ‘diachronic’ to describe a corpus of texts
taken from different time periods that helps researchers keep
track of linguistic development. The ARCHER corpus is a
notable example of a diachronic corpus.
The last important distinction would be between two
corpora constructed for immediate pedagogic applications.
Granger (2004) defines learner corpora as “electronic
collections of spoken or written texts produced by second or
foreign language learners” (p. 124). Learner corpora provide
researchers with information on learner “interlanguage
development” (p. 11) which can be used to classify recurrent
errors or misuses and determine the difficulty of teaching
items (Aston, 2000). Osborne (2002) notes that learner
corpora have other similar applications in language teaching
in that teachers may use them as a basis for language
awareness exercises that would be directed towards typical
problems students face (lexical overuse, grammatical errors,
L1 transfer). Moreover, the comparison of the learner output
with data from a native speaker corpus makes learners aware of
their deviant uses and helps them correct them (Osborne, 2002,
Hunston, 2002).
A pedagogic corpus contains an amount of language that
is representative of what students have been exposed to
throughout a language course, including textbook contents and
transcripts of the taped material used (Hunston, 2002). This
language is usually compared with the language in a native
speaker corpus to determine its degree of authenticity or
usefulness. Teachers may also exploit such corpora to present
their students with all the different occurrences of a
language feature in the textbook and draw their attention to
its use (Hunston, 2002).
This potential of corpus linguistics has been exploited
by a number of researchers in the field of applied linguistics
who have conducted comparative studies to examine the
differences between native language and the language students
are exposed to through EFL textbooks. In most cases, studies
of this kind entail the compilation and use of pedagogical
corpora.
Ute Römer’s study involves the observation of EFL
textbook language in relation to the kind of language likely
to occur in “natural communicative situations” (Römer, 2004b,
p. 152).
Through her analysis of the EFL textbooks used in German schools,
Römer (2004b) suggests that the language found in textbooks,
especially in dialogues, tends to be simplified and unnatural
compared to real everyday language. She argues that “it’s
rather doubtful whether texts like this can better serve the
purpose of preparing learners for the English that they are
likely to encounter in real life” (Römer, 2004b, p. 156).
The GEFL corpus and if-clauses
Römer’s search for authenticity in ELT goes further with the
compilation of the German English as a Foreign Language
Textbook Corpus (GEFL), a computerized version of a collection
of German EFL textbook texts that enables her to examine
rapidly and efficiently the occurrences of particular items in
context (Römer, 2004b). The GEFL corpus consists of two
subcorpora drawn from two major EFL coursebooks widely used in
German secondary schools, Green Line New and English 2000. As
“spoken-type texts like dialogues, interviews better serve the
purposes of communicative competence and prepare students for
any prospective discourse with native speakers of English”
(Römer, 2004b, p. 156), Römer excludes literary written
samples from the collection and limits the comparison between
the GEFL corpus and BNC data to spoken language. She focuses on a
problematic area of the English language for teachers and
learners, if-clauses, with special attention to their use in
conditional constructions and their collocational behavior
(Römer, 2004b).
According to the analysis, EFL textbooks tend to
overemphasize certain tense combinations in if-clauses, normally
referred to as Type 0, Type 1, Type 2 and Type 3 Conditionals
in regular grammars, whereas the most frequent uses in BNC are
“underrepresented” (Römer, 2004b, p. 160) or even left out.
Römer (2004b) underlines that to achieve authenticity
there is a need for accurate contextualization of lexical
items, with “the use of contexts in which they typically
appear in actual language use” (p.161). More importantly, it
is necessary to help students become more confident in
extending the use of structures beyond what they have been
taught in the textbook examples (Römer, 2004b).
Modal Verbs
In addition to her research on if-clauses, Römer uses the GEFL
corpus to address another problematic grammatical phenomenon,
the modal verbs. The issues under investigation include the
coverage of the native speaker English in the textbook
materials and the crucial differences between “school” English
and “real” English at the level of grammar (Römer, 2004a).
Römer (2004a) carried out a BNC-based analysis of
nine modal verbs (can, could, may, might, will, would, shall, should, must
and ought to) that appear in the particular EFL textbook series
to find their frequency, the contexts in which they typically
occur and their different meanings in spoken discourse.
She then carried out a co-occurrence analysis to find “the
occurrence of different modals in questions, set phrases,
if-clauses, and passive constructions” (p. 289). The most
frequent modals were will/’ll, would/’d and can. As the data
showed, can is used to express three different meanings:
ability (36%), possibility (31.5%) and permission (23.5%)
(Römer, 2004a).
In the second part of the survey, Römer (2004a) went
through the same process with the German EFL textbook series Learning
English Green Line (Vols 1-6) and the grammar book Learning English
Grundgrammatik.
The last part of the survey involved the comparison of
the findings in the textbook and corpus analysis in terms of
frequency, distribution of meanings and co-occurrence. Römer
(2004a) concludes that “this comparison makes it clear that
there are huge discrepancies between the use of modal
auxiliaries in authentic English and in the English taught in
German schools” (p. 193). She observes, for instance, that
specific modals like will/’ll, can and must tend to be overused relative to
the rest (Römer, 2004a). She also notices that textbooks
give unreasonable precedence to lower-frequency meanings of
modals (e.g. the permission meaning of may), whereas meanings
commonly used in speech are underrepresented or completely
left out (Römer, 2004a). In addition, the negative forms of
some modals are used in higher percentages in textbooks, and
important fixed phrases of everyday speech, such as “if I may”,
are absent (Römer, 2004a).
All those mismatches between corpus and textbook data signify
a gap between the spoken language and the “school” language.
As a result, Römer (2004a) insists on the introduction of a
more native-like kind of English in EFL textbooks and
proposes that the modals be presented as a group rather than
individually, so as to make students aware of their
distinctive properties compared to other verbs.
They should also be taught in the order of frequency in which they are
found in the corpus. Similarly, “if we are to enable pupils to
communicate successfully”, priority should be given to the
meanings of modals that are most preferred in speech (Römer,
2004a, p. 196). In this respect, Römer emphasizes the
development of new pedagogical material that will introduce
corpus-generated examples instead of ready-made ones. Corpus
data will thus help students gain insight into how language is
used by native speakers and see the different contexts to
which the use of a grammatical feature may be extended.
Similar studies that involve the compilation of textbook
corpora have been conducted by Biber et al. (2004) and Anping
(2005), as cited in Meunier & Gouverneur (2009). Anping
constructed a corpus out of texts taken from EFL textbooks of
Chinese learners and from international corpora. The aim of
his research was to find out if the EFL textbooks are
consistent with the current approaches in teaching and
learning (Meunier & Gouverneur, 2009). Biber, on the other
hand, focused on academic discourse using a corpus of written
material from academic textbooks. The corpus investigation
revealed the use of particular linguistic features in EAP
textbooks and classrooms in American universities (Meunier &
Gouverneur, 2009).
Richard Cullen and I-Chun Kuo (2007) are among the
researchers in the field who have argued that the findings of
corpus research should be incorporated into pedagogical
materials. They direct their attention to the treatment of
spoken grammar in the EFL textbooks published in the United
Kingdom. They point out that this kind of “conversational”
(p.362) grammar, used in informal, everyday speech events,
needs to be included and accurately represented in EFL
textbooks (Cullen & Kuo, 2007). Accordingly, they focus on
particular grammatical features taken “from corpus
descriptions of standard, non dialectal conversational
English” (p. 362) and selected in terms of frequency. As
Cullen & Kuo (2007) claim, beyond their frequency of occurrence,
knowledge of such features facilitates the communicative
interaction of students with native speakers.
To examine the current coverage of conversational
features in EFL textbooks, the researchers group them
into three categories. Category A covers
situational ellipsis, “noun phrase prefaces known as heads”
(Carter & McCarthy, 1997) and noun phrase tags (tails) (Cullen &
Kuo, 2007, pp. 366-367). In Category B, they include fixed
words or phrases like actually, really, kind of/sort of, you know etc.
that are frequently used in everyday speech for a variety of
functions (hedging, imprecision, vagueness) (Cullen & Kuo,
2007). Category C consists of lexical units that have
alternative “ungrammatical” uses in informal conversational
contexts (less instead of fewer with countable plural nouns, or
was rather than were in second conditional structures) (Cullen &
Kuo, 2007).
They survey 24 mainstream EFL textbooks that cover five
different proficiency levels to find out “whether learners are
made aware of such phenomena or whether they are only
presented with the forms traditionally felt to be correct”
(Cullen & Kuo, 2007, p. 371).
As summarized in the findings of their research, Cullen &
Kuo (2007) observe that Category A tail and head structures
occur only in some of the advanced-level textbooks, while
ellipsis occurs in only two of the 24 books. Special attention
is given to the Category B features, especially at the upper
intermediate and intermediate levels, as expressions like a kind
of/sort of, a bit, I mean and you know are found to be common in the
textbooks. As for Category C features, they note that only
four of the textbooks surveyed provide some feedback (Cullen &
Kuo, 2007).
Cullen & Kuo (2007) conclude that even though EFL
textbooks on the British market have attempted to incorporate
some aspects of spoken grammar, they still offer a
rather limited representation of the variety of features it
consists of. Especially where the educational goal is
communicative competence and engagement with native speakers,
the development of EFL textbooks needs to aim at a
more adequate representation and practice of conversational
grammar, even in the case of pre-intermediate and intermediate
learners (Cullen & Kuo, 2007).
Meunier & Gouverneur’s studies (2009) concentrate on the
use of phraseology in the context of EGP (English for General
Purposes) textbooks. Their
research is based on a large corpus of the most widely used EGP
textbook series for both intermediate and advanced levels
(the TeMa corpus). Apart from texts, the TeMa corpus includes
transcripts of all taped material and vocabulary exercises
along with the guidelines, thus providing “a richness of
pedagogic input” (Meunier & Gouverneur, 2009, p. 187). Most
importantly, the TeMa corpus is “pedagogically annotated”
(Meunier & Gouverneur, 2009, p. 186). Specific tags were
applied to each one of the vocabulary exercises and tasks
according to its purpose, e.g. matching, and to whether the
vocabulary used in the activity is given in advance. According
to Meunier & Gouverneur (2009), this kind of coding of the
vocabulary part of the corpus allows for exploitation of the
data from multiple perspectives. Researchers are able to
compare textbooks representing different levels in terms of
selection of vocabulary and investigate their input
(expressions, words) and the way it is practiced. The findings
serve as useful feedback for the textbook authors for future
improvements.
In her 2008 pilot study, Gouverneur reports a complete
inconsistency in the use of collocations across different
textbooks, though a consistency in the use of particular tasks
across all levels. Her study also indicates the need for tasks
that will practice students’ cognitive skills, such as
noticing, opening the way for potential improvements. One such
improvement could be to accompany the textbook with a CD-ROM
that includes a variety of teaching and learning
materials such as concordance lines, corpus-based or
data-driven activities and extra examples for further practice
(Meunier & Gouverneur, 2009).
The TeMa corpus may also help teachers to evaluate
textbook design. An important issue is the type of
metalanguage used in textbooks to describe or categorize
vocabulary items and aspects of phraseology (Meunier &
Gouverneur, 2009). “A pilot study in 2007 has shown that the
metalanguage used in textbooks, and especially in the
guidelines to exercises is far too general and indirect”
(Meunier & Gouverneur, 2009, p. 196). As a result, the use of
more specific terms such as fixed idioms, collocations etc. to
refer to lexical units in a textbook would better aid
students’ conceptual organization and understanding (Meunier &
Gouverneur, 2009).
CHAPTER THREE: Methodological framework
Materials
The aim of the current study is to determine the degree
of authenticity of the language used in the Greek State
school EFL textbooks, with specific attention to modal verbs.
In particular, the research question to be investigated is how
closely the use of five modals in the Greek State school
EFL textbooks corresponds to real-world use, as reflected in the
spoken part of the BNC.
To answer the research question stated above, we conducted a
corpus-based analysis of the modals under investigation in
terms of frequency and occurrence of different meanings in
four school textbooks. The textbooks are: English 5th grade: Pupil’s
and activity book by E. Kolovou & A. Kraniotou (level A1/A1+),
English 6th grade: Pupil’s and activity book by E. Efremidou, E. Zoe-Repa
& F. Fruzaki (level A2-/A2), Think teen, 1st grade of junior high school (for
beginners): Student’s and activity book by E. Karagianni, V. Koui & A.
Nikolaki and Think teen, 1st grade of junior high school (for advanced learners):
Student’s and activity book by E. Karagianni, V. Koui & A. Nikolaki
(level B1-/B1).
Both the frequency analysis and the meaning analysis were
carried out on the electronic versions of the four school
textbooks (our pedagogic corpora) with the help of the specialized
software MonoConc 2.2. MonoConc 2.2 is a fast text-searching
program with a user-friendly interface that includes many
features, such as the ability to create word lists (in both
alphabetical order and frequency order), generate concordance
output and give collocation information.
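As a rough illustration of what a concordancer does, the following Python sketch builds a frequency-ordered word list and keyword-in-context lines for a node word. It is not MonoConc's implementation; the function names, the tokenization rule and the sample sentence are assumptions made for this example only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (contractions kept whole)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def word_list(text):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter(tokenize(text))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def concordance(text, node, width=30):
    """Return one keyword-in-context line per occurrence of the node word."""
    pattern = r"\b" + re.escape(node) + r"\b"
    lines = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left co-text so that the node words line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Invented sample text, not taken from the textbooks under study.
sample = "You must be quiet. Can you swim? I can swim, but she can't."

for line in concordance(sample, "can"):
    print(line)
```

Because the apostrophe counts as a word boundary, a search for can in this sketch also returns contracted negatives such as can't. Whether these count as occurrences of the modal is an analytical decision that the sketch leaves to the researcher.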
Procedure
At this stage we made use of the concordancer. Each one of the
pedagogic corpora was loaded into the program where we chose
the option of text search. Once we introduced the target word
(also called node word or keyword) in the text search, in this
case the modal verb under examination, we were presented with
all the occurrences of the word in the whole textbook.
Although the search was conducted automatically, the
interpretation of the findings called for manual work, as
every single use of the modal had to be examined thoroughly.
The reason is that, in order to identify the meaning
of the modal in each occurrence, we needed to consider the
full context or co-text in which it was found. The co-text is
provided in the upper window of the concordance listing when
clicking on the particular sentence. The close examination of
each modal occurrence enabled us to exclude from the search
all the instances in which the modal was not used within a
sentence or a meaningful context (e.g. in grammatical
presentations). After counting the occurrences
corresponding to each meaning, we ranked the meanings in order
of frequency and drew conclusions about which meanings are
prioritized across the four textbooks.
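The ranking step can be sketched in Python as follows, assuming that each retained occurrence has already been given a meaning label by hand. The labels and counts below are invented for illustration and are not the study's data; the function name is likewise hypothetical.

```python
from collections import Counter

def meaning_distribution(labels):
    """Rank meaning labels by frequency and attach each one's percentage share."""
    counts = Counter(labels)
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(meaning, n, round(100 * n / total, 1)) for meaning, n in ranked]

# Invented manual annotations for "can" in one textbook (illustration only).
can_labels = ["ability"] * 5 + ["permission"] * 3 + ["possibility"] * 2

for meaning, count, share in meaning_distribution(can_labels):
    print(f"{meaning}: {count} occurrences ({share}%)")
```

Keeping the manual labelling separate from the counting mirrors the procedure described above, in which the search is automatic but the interpretation of each occurrence is not.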
To carry out the frequency analysis, we counted all the
matches, that is, the occurrences of each modal in the four
textbooks. The occurrences in which the modals were not used
in meaningful contexts were left out.
The results from both analyses were subsequently used (a)
for comparison among the four textbooks and (b) for comparison
between the pedagogic corpus and the BNC corpus in order to
draw conclusions regarding the use of particular modal verbs
by native speakers and their selection made by coursebook
writers for EFL teaching purposes.
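Because the textbooks and the spoken part of the BNC differ greatly in size, one common way to make such a comparison is to normalize raw counts to a rate per million words and then compare the resulting rank orders. The sketch below assumes this approach, which the procedure described here does not spell out. All counts and the 50,000-word textbook size are invented; only the 10 million word size of the spoken BNC is taken from this study.

```python
def per_million(raw_count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return raw_count * 1_000_000 / corpus_size

def frequency_ranking(counts, corpus_size):
    """Return (modal, per-million rate) pairs, most frequent first."""
    rates = {modal: per_million(n, corpus_size) for modal, n in counts.items()}
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))

# Invented counts for illustration only; they are not the study's data.
textbook_counts = {"can": 120, "must": 40, "should": 25, "may": 10, "might": 5}
reference_counts = {"can": 30000, "should": 11000, "must": 7000, "might": 6000, "may": 4000}

TEXTBOOK_SIZE = 50_000        # hypothetical size of one pedagogic corpus
REFERENCE_SIZE = 10_000_000   # spoken part of the BNC, as stated in the study

textbook_rank = [modal for modal, _ in frequency_ranking(textbook_counts, TEXTBOOK_SIZE)]
reference_rank = [modal for modal, _ in frequency_ranking(reference_counts, REFERENCE_SIZE)]

print("Textbook order: ", textbook_rank)
print("Reference order:", reference_rank)
```

Comparing rank orders rather than raw counts keeps the focus on which modals are prioritized in each corpus, which is the question the comparison is meant to answer.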
For reasons of practicality and economy of effort and
time, the research was based on the British National Corpus
(BNC) data provided by Ute Römer (2004). Römer limited her
research to spoken material as it is proven that modals occur
much more frequently in conversational English. In this
respect, she argued for the introduction of instances of
functional communicative use of language in the EFL textbooks,
traditionally targeted at written discourse. The same concern
led us to focus on spoken discourse. As a result, the
study built on the frequencies of five modals (can, should, may,
might, must) in the 10 million word spoken part of the BNC
corpus. The different meanings of the modals in the BNC
corpus, identified by Römer, were also used as such in the
current study.
Evidently, the criteria for the selection of the
particular methodology were the availability of the electronic
corpora of the four school textbooks and the MonoConc software
which facilitated the whole process of the analysis and
allowed manipulation of the textbook corpora in various ways.
A manual analysis would have been far too laborious. Among the
advantages of this methodology, we can list the ease
and speed of processing and the possibility of expanding the
co-text in a concordance line to obtain a better sense of the
meaning use.
CHAPTER FOUR: Results
This chapter aims to present the outcomes of the
frequency and meaning distribution analyses carried out for
each of the textbooks (5th grade up to 1st grade of junior high
school). The findings will allow us to provide answers to the
research questions set in the first chapter.
Initially, we will focus on how the particular modals are
presented and described in each of the textbooks. In all four
textbooks the introduction of the modals follows a concrete
pattern. There is a mention of their meanings in a separate
grammar corner within the context of the unit and a more
detailed description of their usage in the grammar section
towards the end of the book.
In the fifth grade pupil’s book (p. 68) students are
acquainted with three modal verbs in the following order: can,
must and should. May and might are not taught directly but are
found with considerable frequency in the textbook.
Can is used to talk about ability, must for
obligation and should for giving advice. In the grammar appendix
the three modals are listed in a similar way. There we also
find a meaning distinction, as it is clarified that must is
stronger than should.
To get a broader view of how these particular modals are
treated at this proficiency level (A1+/A2- level) and identify
any common aspects to this level, we need to examine the
meaning uses covered and the terms used to describe them in
other EFL textbooks of the same level. In addition, we need to
consider the order in which the modals are presented in the
textbooks, as it may reveal something about their status or
significance.
We may look for instance at two textbook series
circulating in the Greek market, viz. The Wonderkids (Longman)
and Upstream 1 (Express Publishing). In the former, the only
modals introduced are can and must. Can is presented first and is
found with the meanings of ability and permission. Must is
used to express obligation. In the grammar section the meaning
uses of the two modals are explained in the mother tongue
probably to ensure comprehension. Additionally, ways to
respond to questions for permission, using can, are suggested.
In Upstream 1, on the other hand, apart from can and must, we
also find should. The meanings given for must and should are
similar to those in the 5th grade textbook, though it is
noticeable that can is addressed only with the permission
meaning.
A comparison among the three textbooks indicates an
agreement in the selection of the meaning uses of must and
should (i.e. obligation, advice, respectively). The absence of
an explicit mention of may and might is common to all three
textbooks. This may well suggest that can and should, which are
found with higher frequency than might and may in the BNC, are
considered a good starting point. It is also worth mentioning
that the modals are introduced in a similar order (can, must,
should) in all of the textbooks.
At this point we also need to consider the frequency
distribution of the modals in the 5th grade textbook and
compare it to the one represented in the BNC corpus, to figure
out the prominence of each modal in grammar teaching.
Figure 1. Relative frequency of modals in the 5th grade textbook and the BNC corpus
The figure above demonstrates the relative frequency of
modals in the BNC corpus and the 5th grade textbook and allows
us to draw some interesting conclusions regarding the
treatment of each modal in the context of this EFL textbook. At
first glance, nearly all of the modals in the EFL textbook (can,
must, should, may) are used with much higher frequency than in
the BNC corpus. The case of can is significant: its frequency
rate is extremely high, accounting for over 50% of all modal
occurrences. Such an overuse implies an uneven distribution of
modals across the textbook. What is more, apart from can, which
occupies the first position in both the textbook and the BNC,
the frequency order of the modals in the textbook bears no
resemblance to that of the BNC corpus (can, should, might, must,
may) illustrated in the figure. A case in point is might, a
modal evidently frequent in spoken language, which occurs with
a markedly low frequency in the EFL textbook.
The results of the comparison between the BNC and the
textbook regarding the meaning distribution of modals are
collected in Tables 1 to 5 below. The percentages under
each meaning category represent the frequency of the
particular meaning in the overall occurrences of the modal in
the textbook and the BNC corpus, respectively.
Table 1. Different meaning distribution of can in the 5th grade textbook and the BNC

can                   ability   possibility   permission
5th grade textbook    81.4%     4.1%          14.4%
BNC                   36%       31.5%         23.5%

Table 2. Different meaning distribution of may in the 5th grade textbook and the BNC

may                   possibility   permission
5th grade textbook    88.8%         11.1%
BNC                   83%           13%

Table 3. Different meaning distribution of must in the 5th grade textbook and the BNC

must                  obligation   inference/deduction
5th grade textbook    86%          14%
BNC                   52%          39%

Table 4. Different meaning distribution of should in the 5th grade textbook and the BNC

should                advice   hypothetical
5th grade textbook    100%     0%
BNC                   62.5%    30%

Table 5. Different meaning distribution of might in the 5th grade textbook and the BNC

might                 possibility   permission
5th grade textbook    100%          0%
BNC                   95%           3.5%

First of all, we observe that the hypothetical meaning of
should, which accounts for a considerable number of occurrences
in the BNC corpus (30%), is not addressed at all in the textbook.
The same goes for might, where the permission meaning is not
represented at all, not even to a lesser degree. Looking at the
meaning distribution of may, we may argue that both of its
meaning uses, highly frequent in speech, are covered
sufficiently in the textbook. In the case of must, however, the
obligation meaning is accentuated over that of
inference/deduction. Can is used primarily with the ability
meaning, while the other two meanings are covered inadequately.
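The comparison for can reported in Table 1 can be restated as percentage-point gaps between the textbook and the BNC. The sketch below uses only the figures given for can in that table; the function name is hypothetical, and the gap measure is a simple illustration rather than a statistic used in the study.

```python
def percentage_point_gaps(textbook, bnc):
    """Textbook share minus BNC share for each meaning, in percentage points."""
    meanings = sorted(set(textbook) | set(bnc))
    return {m: round(textbook.get(m, 0.0) - bnc.get(m, 0.0), 1) for m in meanings}

# Shares for can in the 5th grade textbook and the BNC, as given in Table 1.
can_textbook = {"ability": 81.4, "possibility": 4.1, "permission": 14.4}
# The quoted BNC shares sum to 91%, so some uses are presumably not listed.
can_bnc = {"ability": 36.0, "possibility": 31.5, "permission": 23.5}

gaps = percentage_point_gaps(can_textbook, can_bnc)
# A positive gap means the textbook uses the meaning more than the BNC does.
```

Read this way, the ability meaning is over-represented by about 45 percentage points, while the possibility meaning falls short by about 27 and the permission meaning by about 9.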
With respect to the 6th grade pupil’s book (level A2-/A2),
we observe that the modal verbs presented are can, may and
should. Might and must are present even though not commented on
directly. In this textbook, modals are taught in a rather
inductive way since students, instead of being presented with
explicit information right from the beginning, are invited to
guess or infer from the context the meaning expressed by each
modal. Only in the grammar section, where all grammatical items
introduced in the textbook are described in detail, do we find
information on their meaning uses. The meanings identified
are: ability, possibility and permission for can, possibility
and permission for may, and advice for should. There is also some
information on the different usage of may and can when asking
for permission (can is preferred in informal speech).
Grammar Wheels 3 (level A2) published by Hillside Press
offers a lengthy account of the various meanings of modals in
the target language. Can is used for expressing ability and
for asking or giving permission. Might and may are both
addressed with the meaning of possibility, the latter with the
permission meaning as well. As for must, it is used in the
textbook to denote either personal obligation or order/command,
while it can also be used for giving advice. There is also
reference to the deduction/inference meaning. In Top Score 3
(Oxford), on the other hand, we find only may, might and must with
the aforementioned meanings.
At this level we come across more detailed or thorough
descriptions of the five modals; in particular, more than one
meaning use of each modal verb is introduced and practiced.
However, even though we find a similar order of introduction
for the modals present in all of the textbooks and a
consistency in the terms used to refer to their meanings, the
three textbooks do not have much in common. Grammar Wheels
addresses all modals under investigation without exception,
whereas Top Score 3 and the 6th grade textbook address just
three of them.
Figure 2. Relative frequency of modals in the 6th grade textbook and the BNC corpus
As far as the frequency of the modals in the textbook is
concerned, the figure above indicates an excessive use of can
over the rest of the modals when compared with their frequency
distribution in the BNC corpus. The order of frequency is
also completely different. May is found in second place,
with should, must and might following, whereas in the BNC the
modals are sequenced as can, should, might, must and may. In
addition, most of the modals occur with much higher frequency
than in the BNC. Might and should are the only modals
underrepresented. Most striking perhaps is the case of might,
whose frequency rate in the textbook is substantially lower
than in the BNC.
Table 6. Different meaning distribution of can in the 6th grade textbook and the BNC

can                   ability   possibility   permission
6th grade textbook    87.1%     10.9%         1.8%
BNC                   36%       31.5%         23.5%

Table 7. Different meaning distribution of may in the 6th grade textbook and the BNC

may                   possibility   permission
6th grade textbook    91.7%         8.2%
BNC                   83%           13%

Table 8. Different meaning distribution of must in the 6th grade textbook and the BNC

must                  obligation   inference/deduction
6th grade textbook    77.7%        22.2%
BNC                   52%          39%

Table 9. Different meaning distribution of should in the 6th grade textbook and the BNC

should                advice   hypothetical
6th grade textbook    100%     0%
BNC                   62.5%    30%

Table 10. Different meaning distribution of might in the 6th grade textbook and the BNC

might                 possibility   permission
6th grade textbook    100%          0%
BNC                   95%           3.5%

As for the different meaning uses, in the textbook can is
primarily used to express ability and less frequently
possibility. The frequency gap between the two meaning uses is
much wider than in the BNC corpus (31.5% possibility, 36%
ability). Additionally, the permission meaning, used in 23.5% of
the cases in the BNC corpus, is hardly acknowledged in the
textbook. The frequency rates for the meaning uses of may and
might seem to be more consistent with the BNC evidence. Yet
for should and might we once more find no instances of meaning
uses other than the most frequent ones.
None of the modals under investigation are presented in
the 1st grade of junior high school textbook (for beginners)
except for should. This absence cannot be understood or
justified if there is supposed to be a continuity in the
grammar instruction of primary education that includes the
recycling of past grammar items along with the introduction of
new ones. Nevertheless and despite the fact that no modal is
explicitly taught or practised, the frequency count (Figure 3
below) reveals a recurrent use of the particular modals
throughout the textbook.
Figure 3. Relative frequency of modals in the 1st grade of junior high school textbook (for beginners) and the BNC corpus
As we can see in the figure, there is a huge frequency
gap between can, on the one hand, and the other four modals, on
the other. It is also evident that the modals do not occur
in the frequency order suggested by the BNC (can, should, might,
must, may). For instance, might is the third most frequent modal
in the BNC (3.93%), whereas in the textbook it comes in last
place. Similarly, the frequency difference between might and
must is very small in the BNC but much greater in the textbook.
Additionally, it seems that most of the modals in the textbook
are used more regularly than in the BNC corpus. The only
exception is might, where we notice a significant underuse.
Furthermore, the analysis of the different meanings (Tables 11
to 15) demonstrates that only can occurs with a variety of
meanings (ability, possibility, permission). The rest of the
modals are used exclusively with their most frequent meaning use
according to the BNC corpus. This suggests that equally
important meaning uses are overlooked.
Table 11. Different meaning distribution of can in the 1st grade of junior high school textbook (for beginners) and the BNC

can         ability   possibility   permission
Textbook    78.6%     12.5%         8.8%
BNC         36%       31.5%         23.5%

Table 12. Different meaning distribution of may in the 1st grade of junior high school textbook (for beginners) and the BNC

may         possibility   permission
Textbook    100%          0%
BNC         83%           13%

Table 13. Different meaning distribution of must in the 1st grade of junior high school textbook (for beginners) and the BNC

must        obligation   inference/deduction
Textbook    100%         0%
BNC         52%          39%

Table 14. Different meaning distribution of should in the 1st grade of junior high school textbook (for beginners) and the BNC

should      advice   hypothetical
Textbook    100%     0%
BNC         62.5%    30%

Table 15. Different meaning distribution of might in the 1st grade of junior high school textbook (for beginners) and the BNC

might       possibility   permission
Textbook    100%          0%
BNC         95%           3.5%

Finally, the 1st grade of junior high school textbook for
advanced learners presents the following modal verbs: must, may,
might and should. According to the textbook, must signifies
that it is really important for something to happen, while it
is also used for making guesses. Similarly, might/may are used
to indicate that we are not sure whether something is going to
happen or not. No particular information is provided about the
meaning of should, although it is introduced implicitly under
the general category of ways of giving advice. In the grammar
section the same information is provided about the particular
modals. Must and may/might are termed modals of certainty
and uncertainty, respectively. Should is presented as another
way, among others, of giving advice (the others being why
don't you ….., a good idea is to ….., etc.).
To see how these modals are treated in other textbooks
of A2+/B1- level we will have a look at Heroes 3 published by
Oxford Press and Cosmic published by Pearson Longman.
In Heroes 3, must is presented as a modal verb used for
obligation and rules and may/might as verbs that allow us to
say that something is possible in the future. Should is used for
giving advice or for indicating the right way to do something.
In Cosmic must is found to express certainty (deduction)
and obligation while it can also be used for giving advice.
Can has the meanings of ability and permission or request,
should that of giving advice and may that of possibility and
permission. Might is not presented at all.
Perhaps the common characteristic in all three EFL
textbooks of this level is the fact that they address, to a
certain extent, all five modals under investigation. In all of
them we find quite similar references regarding should, must and
may/might meanings. However, none of the EFL textbooks accounts
for the permission meaning of might.
To figure out the frequency of the particular modals in
the textbook and compare it with BNC, we have to consult
Figure 4 below.
Figure 4. Relative frequency of modals in the 1st grade of junior high school textbook (for advanced learners) and the BNC
The frequency analysis of the particular modal verbs in
this textbook reveals an overemphasis on can in relation to the
other modals. More importantly, we find a discrepancy in the
frequency rates of the textbook compared to those of the BNC
corpus. Once again the frequency order of the modals does not
correspond to the one suggested in the BNC corpus (can, should,
might, must, may).
Regarding the meaning distribution, we see that even
though all meanings of can occur in the textbook context, the
possibility meaning is underrepresented as it only covers a
minor percentage (7.1%). May, might and should are addressed only
with their primary meaning use, while must seems to be more
often used to express obligation.
Table 16. Different meaning distribution of can in the 1st grade of junior high school textbook (for advanced learners) and the BNC

can         ability   possibility   permission
Textbook    83.1%     7.1%          9.7%
BNC         36%       31.5%         23.5%

Table 17. Different meaning distribution of may in the 1st grade of junior high school textbook (for advanced learners) and the BNC

may         possibility   permission
Textbook    100%          0%
BNC         83%           13%

Table 18. Different meaning distribution of must in the 1st grade of junior high school textbook (for advanced learners) and the BNC

must        obligation   inference/deduction
Textbook    88%          12%
BNC         52%          39%

Table 19. Different meaning distribution of should in the 1st grade of junior high school textbook (for advanced learners) and the BNC

should      advice   hypothetical
Textbook    100%     0%
BNC         62.5%    30%

Table 20. Different meaning distribution of might in the 1st grade of junior high school textbook (for advanced learners) and the BNC

might       possibility   permission
Textbook    100%          0%
BNC         95%           3.5%

Along the same lines, it is interesting to carry
out an intertextbook comparison, that is, to examine the
frequency rates of the five modals in each of the four
textbooks and compare the findings. A quick overview of the
four frequency figures reveals that the five modals are
distributed in a similar way across the different textbooks.
For instance, can occurs with the highest frequency in all of
the textbooks. Must is the second most frequent in three of the
textbooks (5th grade, 1st grade of junior high school for
beginners and for advanced learners). The 6th grade textbook
diverges from the rest, as it has a higher percentage for may
(18.7%), which is, however, close to that of must (18.3%).
Might, on the other hand, has the lowest percentage of use in
all of the textbooks.
In terms of meaning distribution, all of the textbooks
seem to give precedence to the ability meaning of can, whereas
might and should are used exclusively with the meanings of
possibility and advice, respectively. May is used to express
permission in only two of the textbooks as in the rest it only
appears with the possibility meaning. Finally, must in its
overall occurrences is most regularly used to denote
obligation.
CHAPTER FIVE: Discussion
Apart from the textbook-by-textbook analysis, to study the
findings more comprehensively we proceeded to compile
a single textbook corpus comprising the data of the four
textbooks. Figure 5 below shows the relative frequency
of the modals in the BNC and the textbook corpus. The
comparison of the textbook corpus data with those of the spoken
BNC corpus will enable us to determine how closely the
use of modals, as presented in the textbooks, corresponds to
their actual use in speech.
Figure 5. Relative frequency of modals in the BNC and the textbook corpus
Evidently, the frequency distribution of the modals in
the textbook corpus differs considerably from the one found in
the spoken corpus of the BNC. As we see, can (70.4%) accounts
for the large majority of the occurrences of modals found in the
textbook corpus. This percentage seems disproportionate
compared to that of the BNC corpus, thus indicating an
overuse of the particular modal. Similarly, may, must and
should are used with a higher frequency in the textbook corpus,
whereas might is significantly underused. Looking at the
frequency percentages, we might also suggest that the frequency
order of the modals in all four textbooks diverges widely from
the one reflected in the BNC corpus. In particular, must (11.1%)
is prioritized over should (8.15%), which is actually more
frequent in speech. It is also worth mentioning that might,
which is in third position in the BNC scale of frequency
(3.93%), is used the least in the textbook corpus (2.06%).
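One hedged way to quantify how far a textbook's frequency order departs from the BNC order is Spearman's rank correlation, a measure the study itself does not compute. The sketch below assumes rankings without ties; the BNC order is the one quoted in the text, while the second order is purely illustrative and is not a result of the study.

```python
def spearman_rho(order_a, order_b):
    """Spearman's rank correlation for two orderings of the same items (no ties)."""
    if set(order_a) != set(order_b) or len(order_a) != len(set(order_a)):
        raise ValueError("both orderings must contain the same distinct items")
    n = len(order_a)
    rank_b = {item: i for i, item in enumerate(order_b)}
    # Sum of squared rank differences across the two orderings.
    d_squared = sum((i - rank_b[item]) ** 2 for i, item in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Frequency order reported for the spoken part of the BNC.
bnc_order = ["can", "should", "might", "must", "may"]
# Hypothetical textbook order, for illustration only.
example_order = ["can", "must", "may", "should", "might"]

rho = round(spearman_rho(bnc_order, example_order), 2)
```

A value of 1 would mean identical orders and a value near 0 little agreement; with only five items, such a coefficient is descriptive at best and should not be read as a significance test.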
At this point, we will have to look for potential
explanations justifying these differences between real life
and the school textbook uses. The fact that certain modals are
introduced as new items and are used extensively in exercises
or reading texts for practice may possibly account for their
overuse over the rest. This is particularly evident in the 5th
and 6th grade textbooks, where the three modals explicitly
taught (can, must and should for 5th grade; can, may and should
for 6th grade) occur with much higher frequency.
Furthermore, the complete disregard for the permission
meaning of might and the higher frequency percentages of can
over may with the same meaning use in all of the textbooks may
indicate a preference for more informal ways of asking or
giving permission. What is more, the syllabi of the textbooks
of primary school grades seem to focus a lot on the students’
competence in talking about personal abilities and skills.
Presumably this focus explains the emphasis on the ability
meaning of can at the expense of others. Last but not least, we
could justify the absence of the hypothetical meaning of should
on the assumption that the whole issue of making hypotheses
is to be addressed when students are presented with
conditional structures at a more advanced proficiency level.
The findings of the comparison concerning the five modals
under investigation can also be compared with the findings on
the Green Line corpus, constructed and used by Ute Römer in her
study on modals, expounded in Chapter 2. The main purpose of
her study was to examine whether the use of modals in the
textbooks resembles actual language use. Römer analyzed the
occurrence of 10 modals (can, could, may, might, will, would, shall,
should, ought to, must) in the 10-million-word spoken subcorpus
of the British National Corpus (BNC) to find out their frequency
and their different meaning distribution along with the
syntactic context in which they occur. She followed the same
procedure with the German EFL textbook series Learning English
Green Line (Vols 1-6) and the grammar book Learning English
Grundgrammatik.
The frequency analysis of the Green Line corpus revealed
that can has the highest frequency (101 occurrences). Will/’ll
followed with 95 occurrences, while the frequency of the other
modals (must, would, could, may, should, shall, ought to, might) varied
between 5 and 29 occurrences. The least frequent modal was
might (3 occurrences).
In terms of meaning distribution, can was found to express
ability in 52.5% of its occurrences. Less frequent meanings
were those of possibility (24.7%) and permission (22.8%). Might
was used exclusively with the meaning of possibility. May
expressed possibility in 58.3% of the cases and permission in
41.7%. Must was used for obligation in nearly all of its
occurrences (93.1%). A small percentage was covered by the
meaning of inference/deduction (6.9%). Should was mainly used for
advice (80%) and less frequently to express hypothetical
situations (20%).
The comparison of the results with the BNC data showed
that there are several mismatches between the use of modals in
the EFL textbook context and real-world use. In terms of
frequency, can and must tend to be overused compared to should
and might. The order in which the modals are distributed in the
Green Line corpus also proved to have little connection with
the BNC one.
Furthermore, for can expressing ability the percentages in
the Green Line corpus exceed those of the BNC, whereas the
possibility and permission meanings seem to be more evenly
distributed. Concerning may, the frequency gap between the
possibility and the permission meaning in the textbook corpus
(58.3% against 41.7%) is much narrower than in the BNC (83%
against 13%). Striking is also the fact that,
even though must expresses inference/deduction in 39% of its
occurrences in the BNC, the textbook corpus addresses this
meaning only in 6.9% of the occurrences. Might is used
exclusively to express possibility.
Comparing the findings of the Green Line corpus with our
textbook corpus, we observe that both corpora fail to reflect
the frequency order of modals provided in the spoken part of
the BNC (can, should, might, must, may). Noticeable is the fact that
in both textbook corpora might is used the least. Other common
characteristics are: the overuse of can over the rest of the
modals, with an emphasis on its ability meaning; the underuse of
the inference/deduction meaning of must; and the disregard for
the permission meaning of might. The only difference we find is
in the use of should, in that the Green Line corpus accounts for
both meanings (hypothesis and advice) quite sufficiently.
The demonstrated lack of representation of corpus data in the
EFL textbooks used in primary and secondary education in
Greece strengthens the need for a reassessment of the existing
teaching materials and the subsequent production of materials
closer to authentic language use. Braun (2005) suggests
that only corpus-based descriptions of language are capable of
providing realistic and up-to-date data which can be used as a
resource for the creation of interesting corpus materials.
Indeed, corpus research has already had a significant influence
on syllabus design with the development of corpus-informed
materials such as the Touchstone series (McCarthy, McCarten &
Sandiford, 2005), an innovative series for adult and young
adult learners of English that draws on extensive research
into the Cambridge International Corpus of North American
English (CIC). We also need to mention the Collins Birmingham
University International Language Database (COBUILD) project
that has produced a series of teaching materials based on
concordance data from a large corpus of the English language.
Therefore, the four EFL textbooks used in Greek primary
and secondary education need to be redesigned on the basis of
empirical data that offer information on the most frequently
occurring items and the choices native speakers tend to make
in speech. Of course, “frequency data alone cannot dictate
pedagogy” (Conrad, 2000, p. 550). The finding that a
particular meaning use of a modal is less frequent than
another does not mean that all teachers should necessarily
neglect it. A grammatical feature which does not appear
regularly in everyday language may indeed have an important
function in specific kinds of discourse. Worth mentioning is the
use of might and may to give or ask for permission,
particularly in formal registers. As a result, it is important
that a textbook account for these meanings as well. Along the
same lines, greater emphasis should be placed on the
permission meaning of can, in that it is widely used in informal
everyday situations and thus important for functional language
use outside the EFL classroom. Moreover, the textbooks should
explicitly state the semantic distinction in the use of can and
may for permission by referring to the degree of formality.
The possibility meaning of can needs to be stressed as well.
Finally, should has to be addressed with the hypothetical
meaning, common in speech according to the frequency rates of
the BNC. It would, therefore, be considered an improvement to
use the different meanings of each modal in the textbooks in
proportions similar to those in the BNC corpus.
As regards frequency, it would be advisable that the
modals be introduced in the order in which they are commonly
found in speech, that is, can, should, might, must, may. Even though the most
frequent modals such as can and should need to be prioritized
over less frequent ones, the presence of less frequent modals
in the textbook is necessary as an implicit exposure would be
beneficial for students.
Finally, regarding the presentation of the modals in the
textbooks, we may argue for a move towards a meaning based
categorization. An example of this kind of categorization is
found in the 6th grade textbook, where should is presented among
other ways of giving advice. Similarly, the
inference/deduction meaning of must could be introduced and
contrasted with the negative inference/deduction meaning
of can’t. Likewise, can and may/might could be listed as informal
and formal ways, respectively, to give or ask for permission. In addition,
there should be a provision about the recycling of previously
taught modals as the students advance to a higher grade. This
should be seriously taken into consideration particularly in
the 1st grade of junior high school textbook syllabus.
CHAPTER SIX: Conclusion
The primary aim of this study was to conduct a corpus-based
comparative analysis regarding the frequency and meaning
distribution of five basic modals (can, must, should, may, might)
used in the EFL textbooks of Greek state primary and secondary
schools (5th grade to 1st grade of junior high school).
The analysis showed that the presentation and treatment
of these modals in the textbooks differed considerably from
their use as reflected in the spoken subcorpus of the BNC.
The major findings in terms of frequency were an overuse of
certain modals at the expense of others and a mismatch with
the BNC data in the order in which the modals are presented.
As far as the modal meanings are concerned, certain meanings
were emphasized over equally important ones, while in some
cases highly frequent meanings were underrepresented or even
left out.
First and foremost, primary/secondary school teachers and
textbook designers are expected to take into consideration the
gap between the content of the teaching materials and the
naturally occurring language. Secondly, they need to closely
cooperate in making principled decisions concerning the
selection and the grading of modals and their meaning uses on
the basis of corpus evidence.
At this point, however, it should be made clear that the
results of the current study have shed light on only a very
limited part of the content of the state school EFL textbooks,
that is, the modal verbs. To draw reliable conclusions about
authenticity in the EFL textbooks, prospective textbook
designers and teachers will have to conduct more extensive
research that addresses other teaching items as well.
Therefore, a lot of corpus-driven work still has to be
done if the teaching content is to become more focused on
conditions of real world use. A significant part of that work
lies in the hands of corpus linguists and researchers
dealing with the empirical study of language through
computer-assisted techniques. The new generation of EFL
teachers who have received training in corpus research and are
capable of conducting their own small-scale corpus
investigations will also play a key role in addressing similar
issues. Finally, once widely disseminated and exploited by
textbook designers,
these descriptive studies have the potential to revolutionize
the English language curriculum, enabling students to use a
more authentic kind of language. Consequently, if corpus
linguistics is to have an optimum impact on language learning
and lead to concrete pedagogical applications, the cooperation
between researchers and teaching professionals should be
reinforced.
69
A corpus-based study of the frequency and meaningdistribution of five modals
References
Aston, G. (2000). Corpora and language teaching. In L.
Burnard & T. McEnery, (Eds.), Rethinking language pedagogy from
a corpus perspective: Papers from the third international conference on
teaching and language corpora vol 2, (pp. 7-17). Frankfurt: Peter
Lang.
Bernardini, S. (2004). Corpora in the classroom: An overview
and some reflections on future developments. In J.
Sinclair, (Ed.), How to use corpora in language teaching, (pp. 15-
36). Amsterdam and Philadelphia, PA: John Benjamins.
Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics:
Investigating language structure and use. Cambridge: Cambridge
University Press.
Braun, S. (2005). From pedagogically relevant corpora to
authentic language learning contents. ReCALL, 17(1), 46-
47.
Chambers, A. (2010). What is data driven learning? In A. O’
Keefe, & M. McCarthy (Eds.), The Routledge Handbook of Corpus
Linguistics (pp. 345-356). Routledge
70
A corpus-based study of the frequency and meaningdistribution of five modals
Conrad, S. (2000). Will corpus linguistics revolutionize
grammar teaching in the 21st century? TESOL Quarterly, 34(3),
548-560.
Cullen, R. & Kuo, I-C. (2007). Spoken grammar and ELT course
materials: A missing link? TESOL Quarterly, 41(2), 361 - 386.
Efremidou., Zoe- Repa, E., & Fruzaki, F. (2009a). English 6th
grade: Pupil’s book. Athens: Ministry of Education and
religion.
Efremidou., Zoe- Repa, E., & Fruzaki, F. (2009b). English 6th
grade: Pupil’s workbook. Athens: Ministry of Education and
religion.
Gilquin, G., & Granger, S. (2010). How can data driven
learning be used in language teaching? In A. O’ Keefe, &
M. McCarthy, (Eds.), The Routledge Handbook of Corpus Linguistics
(pp. 359-370). Routledge.
Granger, S. (2004). Computer learner corpus research: Current
status and future prospects. In U. Connor and T. Upton
(Eds.), Applied corpus linguistics: A multidimensional perspective (pp.
123-145). Amsterdam: Rodopi.
Hunston, S. (2002). Corpora in applied linguistics. Cambridge:
71
A corpus-based study of the frequency and meaningdistribution of five modals
Cambridge University Press.
Johns, T. (1991). From printout to handout: Grammar and
vocabulary teaching in the context of data-driven
learning. In T. Odlin (Ed.), Perspectives on pedagogical grammar
(pp. 293-313). Cambridge: Cambridge University Press.
Kaltenböck, G., & Mehlmauer-Larcher, B. (2005). Computer
corpora and the language classroom: on the potential and
limitations of computer corpora in language teaching.
ReCALL, 17(1), 65–84.
Karagianni, E., Koui, V., & Nikolaki, A. (2009a). Think teen, 1st
grade of junior high school (for beginners): Student’s book. Athens:
Ministry of Education and religion.
Karagianni, E., Koui, V., & Nikolaki, A. (2009b). Think teen, 1st
grade of junior high school (for beginners): Student’s workbook. Athens:
Ministry of Education and religion.
Karagianni, E., Koui, V., & Nikolaki, A. (2009a). Think teen, 1st
grade of junior high school (for advanced learners): Student’s book. Athens:
Ministry of Education and religion.
Karagianni, E., Koui, V., & Nikolaki, A. (2009b). Think teen, 1st
grade of junior high school (for advanced learners): Student’s workbook.
Athens: Ministry of Education and religion.
72
A corpus-based study of the frequency and meaningdistribution of five modals
Kennedy, G. (1998). An introduction to corpus linguistics. Harlow:
Longman.
Kolovou, E., K., & Kraniotou, A. (2009a). English 5th grade: Pupil’s
book. Athens: Ministry of Education and religion.
Kolovou, E., K., & Kraniotou, A. (2009b). English 5th grade: Activity
book. Athens: Ministry of Education and religion.
McEnery, T., Xiao, R. & Tono, Y. (2006). Corpus-based language
studies: An advanced resource book. London: Routledge.
Meunier, F., & Gouverneur, C. (2009). New types of corpora for
new educational challenges. In J. Aimer (Ed.), Corpora and
language teaching (pp. 179-198). Amsterdam: John Benjamins.
O’ Keefe, A., McCarthy, M., & Carter, R. (2007). From corpus to
classroom: Language use and language teaching. Cambridge:
Cambridge University Press.
Osborne, J. (2002). Top down and bottom-up approaches to
corpora in language teaching. In U. Connor and T. Upton
(Eds.), Applied corpus linguistics: A multidimensional perspective (pp.
251-265). Amsterdam: Rodopi.
Römer, U. (2004a). A corpus-driven approach to modal
auxiliaries and their didactics. In J. Sinclair (Ed.),
How to use corpora in language teaching, (pp. 15-33). Amsterdam:
73
A corpus-based study of the frequency and meaningdistribution of five modals
John Benjamins.
Römer, U. (2004b). Comparing real and ideal language learner
input: The use of an EFL textbook corpus in corpus
linguistics and language teaching. In G. Aston, S.
Bernardini, D. Steward (Eds.), Corpora and language learners
(pp. 151-160). Amsterdam: John Benjamins.
Schmitt, N. (2000). Vocabulary in language teaching. Cambridge:
Cambridge University Press.
74