This is the final draft. For the published version, please check:
Close encounters of the third code: quantitative vs. qualitative studies in corpus-based translation studies, in Lefer, M.A. and Vogeleer, S., Interference and normalization in genre-controlled multilingual corpora, Belgian Journal of Linguistics 27: 61-86.
Close encounters of the third code: quantitative vs. qualitative
analyses in corpus-based translation studies
Rudy Loock
Université Lille Nord de France & UMR 8163 Savoirs, Textes, Langage,
CNRS
“[T]he translation itself […] is essentially a third
code which arises out of the bilateral
consideration of the matrix and target codes: it is,
in a sense, a sub-code of each of the codes
involved.” (Frawley 1984: 168)
“I guess you've noticed something a little strange
with Dad. It's okay though. I'm still Dad.”
(Roy Neary, Close Encounters of the Third Kind)
Abstract
This article complements a quantitative study of existential
constructions in French and English, in both translated and original texts,
which was based on comparable corpora (Cappelle & Loock
2013). What this article shows is that such an overall quantitative
approach should be complemented with a more qualitative approach, for
two main reasons: (i) overall quantitative results provide only a general
view on the differences between translated texts and original texts, hiding
subtle but crucial variations; (ii) the use of comparable corpora does not
provide any information on the strategies used by translators and on the
translation process itself. The article also discusses implications for
translator training and translation quality assessment.
Keywords: corpus-based translation studies, parallel corpus, comparable
corpus, existential constructions, quality, translator training
1. Introduction
In the same vein as many case studies within the corpus-based translation
studies (CBTS) framework, a recent study, Cappelle & Loock (2013), has
provided a comparative, quantitative analysis of existential there+BE
constructions in English (1a) and their counterpart il y+AVOIR
constructions in French (1b) (see section 2.1 for a complete definition of
the object of study), both in translated and original texts. Using both self-
collected and pre-existing electronic corpora of post-1980 fiction texts,
Cappelle & Loock (2013) have shown that statistically significant
differences exist not only between French and English (inter-language
differences), as is generally claimed by many contrastive-linguistics and
translation textbooks (e.g. Ballard 2003, 2004; Chuquet & Paillard 1989;
Guillemin-Flescher 1981), but also between translated English (from
French) and original English, as well as between translated French (from
English) and original French (intra-language differences).
(1) a. There was a dog in the garden.
b. Il y avait un chien dans le jardin.
Although the present article does not question the validity of the results
obtained in this study, I claim, in line with studies such as
Johansson (2007), that such an overall quantitative analysis
does not suffice to provide a complete analysis. I thus aim to go further
with a finer-grained, qualitative analysis that answers two
important research questions concerning inter-language differences that
Cappelle & Loock's (2013) study does not address:
1. While the overall quantitative study shows that there are fewer
occurrences of existential constructions in French translated texts than
in the corresponding English original texts (see section 2 for a
summary of the results), can this be explained solely by the ‘omission’
of existential constructions through some syntactic reorganization or
deletion? Or are those omissions actually more numerous,
while we also find the ‘addition’ of existential constructions between
English source texts and the French target texts?1 A similar question
holds for texts translated from French to English, as there are more
existential constructions in translated English than in original French.
2. What kind of strategies are used by translators when translating
sentences with existential constructions, from French to English but
also from English to French, when a literal translation – that is a
translation with an existential construction in the target language – is
not retained?
More generally, the article aims to show that a thorough analysis of a
specific linguistic phenomenon within the CBTS framework requires both
quantitative and qualitative analyses to uncover the specific characteristics
of translated language, and thus requires resorting to both comparable
(monolingual and bilingual) and parallel corpora. This is an approach
advocated by Johansson (2007) in particular – actually suggested as early
as the 1990s – who claims that multilingual corpora like the English-
Norwegian Parallel Corpus (ENPC) are necessary for analyses to be
comprehensive. However, many studies within the CBTS framework still
rely solely on overall quantitative analyses to establish differences
between original and translated languages (see section 4 for examples).
The article thus aims to reassert that different types of corpora are indeed
required: (a) multi-million-word bilingual comparable corpora of original
language to uncover inter-language differences; (b) multi-million-word
monolingual comparable corpora of original and translated language to
uncover significant differences between translated and original texts, thus
uncovering specific features of what has been called interlanguage
(Selinker 1972) or third code (Frawley 1984), and (c) bi-directional
parallel corpora, even small – large electronic corpora of translated texts
are not available for all languages yet – for a qualitative analysis such as
the one described here. As the present study discusses Cappelle & Loock’s
(2013) results obtained with a literary corpus, the parallel corpus used for
this analysis also consists of literary, post-1980 texts.
The article is organized as follows. In section 2 the results of Cappelle
& Loock’s study are summarized and its limits are shown. In section 3, I
provide the results of my complementary analysis, based on a sample of
Cappelle & Loock’s self-collected corpus of original and translated
novels: first of all, detailed results on the addition, omission and
preservation of existential constructions in French to English and English
to French translations; second, results concerning translation strategies
when a literal translation is not retained. In section 4 I discuss the results
of the analysis within the context of other studies in the CBTS field and
show how important such qualitative analyses are, and how important the
use of parallel corpora is. Finally, in a concluding section, I discuss the
implications of my findings for translator training and translation quality
assessment.
2. Cappelle & Loock’s quantitative analysis
2.1. Corpus material
Cappelle & Loock’s (2013) study relies on data extracted from both a self-
collected corpus and a combination of existing large electronic corpora,
one of the aims of their study being a comparison of the results obtained
with the two types of corpora. The self-collected corpus consists of 12
post-1980 novels: 3 in original French, 3 in original English, and their
translations into the other language, for a total of nearly 1 million words (see
section 3.1 for more information). The electronic corpus consists of
extracts from the British National Corpus (Davies 2004) for original
English, from Frantext2 for original French, and from the Translational
English Corpus3 for translated English. As no electronic corpus of
translated French is currently available, no electronic data was used for
translated French (see below). The compiled electronic corpus consists of
15,909,312 words for the BNC sub-corpus, 11,365,626 words for the
Frantext sub-corpus, and 995,143 words for the TEC sub-corpus (with
only texts translated from French being retained), all sub-corpora
consisting of post-1980 fiction texts to allow for relevant comparisons.
The study thus resorts both to a parallel corpus and comparable
corpora as defined for example by Johansson (2007), Granger (2010), or
Teubert (1996): while comparable corpora are “corpora in two or more
languages with the same or similar composition”, a parallel corpus
consists of “a bilingual or multilingual corpus that contains one set of texts
in two or more languages” (Teubert 1996: 245, 247). However, the results
retained by Cappelle & Loock for their study are those obtained from their
electronic, comparable corpora (BNC, Frantext, TEC), except for
translated French for lack of such an electronic corpus. Cappelle & Loock
thus seem to suggest that the use of a smaller, parallel corpus is not
necessary, or even that it cannot be trusted, for a CBTS-oriented corpus study
that aims to uncover inter- and intra-language differences (see Cappelle &
Loock 2013: 269-270).
The tertium comparationis, that is “the common ground on which two
languages can be compared to establish (dis)similarities” (Ebeling 1998:
603), retained in Cappelle & Loock’s study is existential constructions,
that is there-constructions in English and il y a-constructions in French,
considered to be a case of translational equivalence (see below). The study
only covered existential constructions with the verbs be in English and
avoir in French, thus discriminating between such constructions and
presentational constructions, which include other verbs (e.g. come, arrive,
live, remain), as the two constructions actually differ (see Birner & Ward
1998).4 In addition, the so-called prepositional use of il y a in French
(Grevisse 1986) to define a period of time (il y a quatre ans [‘4 years
ago’]) was also discarded.
2.2. Results
What the corpus study has shown is that both significant inter-language
(French vs. English) and intra-language (translated vs. original English;
translated vs. original French) differences exist. A comparison between
the BNC and Frantext sub-corpora shows that there is a highly significant
difference between the two languages, with 2,581 occurrences per million
words (pmw) of there-constructions in original English, as opposed to
1,079 occurrences pmw of il y a-constructions in original French (z-ratio =
87.55; p<0.0001). The same difference, although less pronounced, is
revealed by the self-collected parallel corpus.
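For readers who wish to check figures of this kind, the sketch below computes a z-ratio for the difference between two frequencies expressed per million words. It assumes a pooled two-proportion z-test in which every word counts as one trial; the procedure actually used is not stated here, so the function name two_proportion_z and the choice of test are illustrative assumptions. The inputs are the BNC and Frantext rates and sub-corpus sizes reported above.

```python
import math


def two_proportion_z(rate1_pmw, n1, rate2_pmw, n2):
    """Pooled two-proportion z statistic from per-million-word rates.

    Assumption: each word is treated as one independent trial, which is
    a simplification for running text.
    """
    p1 = rate1_pmw / 1_000_000
    p2 = rate2_pmw / 1_000_000
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Original English (BNC sub-corpus) vs. original French (Frantext sub-corpus).
z_inter = two_proportion_z(2581, 15_909_312, 1079, 11_365_626)
print(f"z = {z_inter:.2f}")
```

Under these assumptions the result lies close to the z-ratio of 87.55 reported for the inter-language comparison. Other ratios in the study may rest on raw counts that are not reproduced here, so this sketch should not be expected to match them exactly.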
As far as intra-language comparison is concerned, the study has
revealed a highly significant difference between original and translated
English, with 1,376 occurrences pmw of there-constructions in the TEC
sub-corpus (z-ratio = 21.92; p<0.0001). Once again, the self-collected
corpus shows the same intra-language difference, although the difference
is less pronounced. Concerning the intra-language comparison for French,
given the unavailability of electronic data for translated texts, the study
compared results obtained from Frantext with those of the self-collected
sub-corpus of translated French. The observed difference is highly
significant (z-ratio = 7.19; p<0.0001) with 1,535 occurrences pmw in
translated French. The results are summarized in Figure 1.
What this case study has uncovered is that existential constructions
are much more frequent in English than in French, in line with many
contrastive-linguistics and translation textbooks. It also shows that
existential constructions are less frequent in translated English than in
original English, while they are more frequent in translated French than in
original French. Such intra-language results suggest a case of source-
language interference (translationese), although an interpretation
in terms of translation universals as defined by Baker (1993, 1995, 1996)
is also possible (see Cappelle & Loock (2013) for a discussion; I
shall not tackle this issue here).
Figure 1. Results from Cappelle & Loock (2013): existential constructions
in English and French, translated and original (per million words)
From a methodological point of view, the study has also revealed that
results obtained with a small self-collected corpus consisting of 12 novels
and results obtained with larger electronic corpora might differ, sometimes
very significantly. What Cappelle & Loock (2013) conclude about this is
that large electronic corpora provide more reliable results than a random
collection of original novels and their translated counterparts.
The corpus study thus has strong methodological implications, all the more
so as many CBTS studies overlook the question of the
representativeness of the corpus they use (see special issue of Across
Languages and Cultures 13(2), 2012, on this question). As the figures for
translated French come from the small self-collected corpus, this naturally
casts doubt on the results for intra-language differences for French, doubt
that will only be dispelled when an electronic corpus of translated French
becomes available to the community,5 but I will consider here that the
results are valid.
2.3. Limits
Although the quantitative analysis described above uncovers significant,
interesting results, it remains limited in that it does not provide the
complete picture for the numerical results and importantly, it does not
provide any information on the strategies used by translators to deal with
sentences with existential constructions, two kinds of information that
could be of great interest for translator training.
In particular, although Cappelle & Loock (2013) have shown that there is a
‘loss’ of existential constructions for texts translated from English into
French, they do not specify whether this loss corresponds solely to the
omission of such constructions or whether there are actually more
omissions, accompanied by a certain number of additions in translated
French. Knowing the frequency of existential constructions in original
English and in French translated from English, even for the same texts,
does not tell us that the difference is due only to a loss; it could
actually correspond to a (higher) loss and a (small) gain. The same issue
holds for English translated from French, where the use of existential
constructions is higher than in original French: can this
‘gain’ be explained by additions only? Comparable corpora, whether
monolingual or bilingual, compare separate collections of texts. Although
they “represent ordinary language use in each language and should allow
safe conclusions on similarities and differences between the languages
compared” (Johansson 2007: 10), they cannot provide such information.
Moreover, comparable corpora cannot help uncover the strategies that
are used by translators when translating sentences with existential
constructions, in particular from English to French, when a literal
translation is not retained. Is the non-canonical word order that
characterizes an existential construction, in which the semantic subject is
postposed, while dummy there stands for the syntactic subject (see
Lambrecht 1994; Birner & Ward 1998), simply replaced with an
unmarked, SVO word order (1a’) or is/are there some specific type(s) of
syntactic reorganization that translators resort to on a systematic basis?
(1) a’. A dog was in the garden.
These two limits are to be tackled in this article thanks to an analysis of a
parallel corpus that provides source texts and target texts for English and
for French. I consider such an analysis to be qualitative as it goes beyond
overall results by providing a finer-grained description of the phenomenon
under study. The aim here is to show that even when such a corpus is
modest in size, it can provide valuable information that tempers overall
results obtained from multi-million-word comparable corpora, information
that can then be used for translator training and translation quality
assessment.
3. A qualitative analysis
3.1. Corpus material and data extraction
The dataset analyzed here is a sub-corpus extracted from the self-collected
corpus used in Cappelle & Loock’s study. This analysis allows for a direct
comparison between the quantitative analysis and a qualitative analysis, as
the aim is to show that the latter complements the former. Cappelle &
Loock’s (2013) corpus also seems particularly relevant in that regardless
of the novel being considered or the direction of translation, the frequency
of existential constructions is systematically lower for the novels in
French than for the novels in English (see Cappelle & Loock 2013, Figure
7).
The complete corpus is described in Table 1, taken from Cappelle &
Loock’s article.
Title and author/translator                             Reference  Language  Type        Year of publication  Number of words
The Gun Seller, H. Laurie                               HLNT       English   original    1996                 106,866
How to Be Good, N. Hornby                               NHNT       English   original    2001                 83,843
Harry Potter and the Philosopher’s Stone, J.K. Rowling  JKRNT      English   original    1997                 78,546
Tout est sous contrôle, J.-L. Piningre                  HLT        French    translated  2009                 105,894
La bonté : Mode d’emploi, I. Chapman                    NHT        French    translated  2003                 85,608*
Harry Potter à l’école des sorciers, J.-F. Ménard       JKRT       French    translated  1998                 85,240
Empire of the Ants, M. Rocques                          BWT        English   translated  1996                 96,162*
Windows on the World, F. Wynne                          FBT        English   translated  2004                 63,135*
If Only It Were True, J. Leggatt                        MLT        English   translated  2005                 65,157*
Les Fourmis, B. Werber                                  BWNT       French    original    1997                 89,988
Windows on the World, F. Beigbeder                      FBNT       French    original    2003                 65,749
Et si c’était vrai, M. Lévy                             MLNT       French    original    2000                 62,898*
Table 1. List of translated and original novels in the English-French self-
collected corpus (* = manual word count based on 20 pages chosen
randomly – electronic texts were used only when available)
The dataset used here is based on extracts from the 12 novels used in the
quantitative analysis. Specifically, I collected manually the first 25
occurrences of the existential construction in each of the 6 original novels
and in each of the 6 translated novels. The qualitative analysis thus rests
on a first sample of 300 existential constructions. To this were added the
translations of the original existential constructions (25x6, that is 150
translated sentences) as well as the original sentences that resulted in the
use of an existential construction in the translated texts (25x6, that is
another 150 original sentences). The dataset is summarized in Table 2.
Although this may seem a small number of tokens, I claim that such a
modest dataset suffices to show that the overall results must be tempered
and can provide valuable information in terms of translation strategies.
Sample A
  A1  The first 25 existential constructions in each original novel
      (HLNT, NHNT, JKNT, BWNT, FBNT, MLNT)                              150 tokens
  A2  The corresponding 25 translations of the data in A1
      (HLT, NHT, JKT, BWT, FBT, MLT)                                    150 tokens
Sample B
  B1  The first 25 existential constructions in each translated novel
      (HLT, NHT, JKT, BWT, FBT, MLT)                                    150 tokens
  B2  The counterpart of the data in B1 in original texts
      (HLNT, NHNT, JKNT, BWNT, FBNT, MLNT)                              150 tokens
Table 2. Dataset used for the qualitative analysis
Although the dataset theoretically amounts to 600 tokens in total, some of
them – but not all of them – were actually identical. For instance, in the
case of a literal translation as in (2), the original sentence and the
translated sentence were each listed twice ((2a) is listed in both the A1 and
B2 sub-corpora; (2b) is listed in both the A2 and B1 sub-corpora), but this is
exactly what the qualitative analysis is about: checking in detail, in a
parallel corpus, how many existential constructions are preserved, gained or
lost in the translation process. By contrast, (3a) and (3b) are listed only
once.
(2) a. And the reason I had to drop it was because there was a
dying man in the room. (HLNT)
b. Parce qu’il y avait un moribond dans la pièce. (HLT)
‘Because there was a dying man in the room’
(3) a. There was no point in worrying Mrs. Dursley. (JKNT)
b. Il était inutile d'inquiéter Mrs Dursley pour si peu. (JKT)
‘It was useless to worry Mrs Dursley for so little’
3.2. Method
The analysis took place in two stages. First I listed the first 25 existential
constructions in each of the 6 original novels (sample A1) and checked
how many of them were translated literally in the target language, that is,
by their direct counterpart (there+BE by il y+AVOIR and vice-versa). This
made it possible to determine
the level of mutual translatability for the structure, as defined by
Altenberg (1999).6 I also listed the first 25 existential constructions in
each of the 6 translated novels (sample B1) and checked how many of
them were the results of a literal translation of a sentence with an
existential there- or il y a-construction. This enabled me to check, for each
translation direction, from English to French and from French to English,
what proportion of existential constructions was omitted, added, or
preserved after the translation process. Such results cannot be obtained
when exploiting comparable corpora. In spite of the relatively modest size
of the dataset and the small number of novels,7 such results show the non-
systematicity of the general patterns observed in Cappelle & Loock
(2013).
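The first stage can be pictured as a simple tally over aligned sentence pairs. The sketch below is only an illustration: the data were collected and checked manually, and the names classify and tally, as well as the boolean representation of each pair (source has an existential construction, target has one), are assumptions introduced for the example. The first two pairs mirror examples (2) and (3) above; the third is a hypothetical addition.

```python
from collections import Counter


def classify(source_has_existential, target_has_existential):
    """Label one aligned sentence pair (hypothetical helper)."""
    if source_has_existential and target_has_existential:
        return "preservation"
    if source_has_existential:
        return "omission"
    if target_has_existential:
        return "addition"
    return "neither"


def tally(pairs):
    """Count the labels over (source_flag, target_flag) pairs."""
    return Counter(classify(source, target) for source, target in pairs)


pairs = [
    (True, True),    # as in (2): there was ... > il y avait ...
    (True, False),   # as in (3): There was no point ... > Il était inutile ...
    (False, True),   # hypothetical pair where the translator adds the construction
]
counts = tally(pairs)
print(counts)
```

Pairs from sample A can only yield preservations and omissions, while pairs from sample B can only yield preservations and additions, which is why both samples are needed to obtain all three figures.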
The second stage consisted in tagging each translated sentence in A2
to uncover regular strategies used by translators, something which
once again cannot be achieved with a comparable corpus. This
type of analysis is in line with e.g. Vinay & Darbelnet (1958/1995),
Ballard (2003, 2004), Chuquet & Paillard (1989), Guillemin-Flescher
(1981), which aim at providing translators with guidelines resting on
actual language use. Such results are meant to have a pedagogical value,
uncovering some usage constraints that need to be dealt with by
translators if they want to provide natural-sounding translations. This type
of analysis is also in line with Ebeling (1998, 2000), who stresses the
importance of parallel corpora to provide valuable information on
differences and similarities between languages, as well as on translation
behavior.
3.3. Results
3.3.1. Stage 1 results
The results, which are summed up in Table 3, shed new light on those
obtained in Cappelle & Loock (2013). Overall, the results reveal low
mutual translatability (Altenberg 1999; see note 6) of 46% for the whole
dataset, which clearly shows that the two structures are not completely
translationally equivalent, as was already revealed by the significant inter-
language and intra-language differences.
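The 46% figure can be recovered from the preservation counts in Table 3 (27 for English to French, 42 for French to English, each out of 75 source constructions). The sketch below assumes that mutual translatability is computed as the share of source constructions rendered by their direct counterpart, pooled over both directions; the function name mutual_translatability is introduced here for illustration only.

```python
def mutual_translatability(preserved_a, preserved_b, sources_a, sources_b):
    """Percentage of source constructions translated by their direct
    counterpart, pooled over both translation directions (assumed formula)."""
    return 100 * (preserved_a + preserved_b) / (sources_a + sources_b)


# Preservation counts from Table 3: En>Fr and Fr>En, 75 source tokens each.
overall = mutual_translatability(27, 42, 75, 75)
en_to_fr = 100 * 27 / 75
fr_to_en = 100 * 42 / 75
print(overall, en_to_fr, fr_to_en)
```

The per-direction values correspond to the 36% and 56% preservation rates given in Table 3, and show that the pooled figure conceals a marked asymmetry between the two translation directions.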
Let us first consider English to French translations, a translation
direction for which we expect a certain loss of existential constructions if
we consider the data from the quantitative analysis based on the
comparable corpora. What can be noticed is that, in addition to the
omission of 48 (out of the 75) occurrences of existential constructions,
there is a sizeable proportion of preserved existential constructions (27),
and above all, a substantial number of additions (42 occurrences for
the three novels). While the proportion of omissions vs. preservations is
predictable (only 36% of there-sentences are translated with il y a-
sentences), the last result, namely the proportion of additions, is quite
unexpected, as it goes against the general trend uncovered by Cappelle &
Loock (2013), which showed that existential constructions are fewer in
number in French, whether translated or original. Examples listed in (4),
(5), and (6) show an instance of each case (preservation, omission and
addition, respectively):
(4) a. And I make some weak what-kind-of-girl-do-you-think-I-am
joke, but of course there's nothing much to joke about, really.
(NHNT)
b. Et moi je riposte mollement par une vanne du style “Vous me
prenez pour qui?”, mais bien sûr, il n'y a pas de quoi rire.
(NHT)
‘And I gently respond with a joke like "Who do you think I
am?", but of course, there is nothing to laugh about’
(5) a. At the prices they were charging for liquor, there were
probably only a couple of dozen people in the world who
could afford to stick around for a second drink. (HLNT)
b. Vu le prix des boissons alcoolisées, à peine plus d’une
vingtaine de personnes en ce bas monde avaient les moyens
de commander un deuxième verre ici. (HLT)
‘Given the prices of alcoholic beverages, hardly about twenty
people in the world could afford ordering a second drink
here’
(6) a. "Mr Woolf," I said, "before you name a place, make sure you
can book it for at least ten people". (HLNT)
b. Monsieur Woolf, avant de me proposer un restaurant,
assurez-vous qu’il y a de la place pour une dizaine de
personnes environ. (HLT)
‘Mr. Woolf, before you suggest a restaurant, make sure that
there is enough room for about ten people’
                     PRESERVATION            OMISSION                ADDITION
                     (are in the source      (are in the source      (are not in the source
                     text and are preserved  text but are not kept   text but are added in
                     in the target text)     in the target text)     the target text)
English > French
  HL                 6                       19                      13
  JKR                11                      14                      16
  NH                 10                      15                      13
  Total En>Fr        27 (36%)                48 (64%)                42
French > English
  ML                 12                      13                      15
  BW                 15                      10                      11
  FB                 15                      10                      8
  Total Fr>En        42 (56%)                33 (44%)                34
Table 3. General translation strategies concerning source and target
existential constructions
As far as French to English translations are concerned, what we notice is
that a higher proportion of existential constructions are preserved (42
occurrences out of 75, that is 56%), which is in line with Cappelle &
Loock’s results: a gain is to be expected when the target language is
English and the source language is French. But the expected gain is
strengthened by the addition of a high number of there-constructions (34
of the first 75 there-constructions in the translated novels were not in the
original texts). On the other hand, we also notice quite a significant
proportion of omissions (33 occurrences), which means that the (minimal)
gain hides a certain number of omissions. Once again, the picture
provided by the quantitative analysis of comparable corpora is actually
incomplete.
These results are discussed more thoroughly in section 4, but what
they inevitably show is that the differences uncovered through the use of
comparable corpora cannot provide the full picture on the distribution of a
specific construction in translated and original texts. Even the analysis of a
small parallel dataset shows that the general patterns observed in Cappelle
& Loock (2013) are not systematic. A finer-grained approach with a bi-
directional parallel corpus is therefore necessary.
3.3.2. Stage 2 results
The second aim of the present study is to uncover strategies used by
translators when a literal translation of sentences with existential
constructions is not retained. What is meant by literal translation is the use
of a there-construction to translate a sentence with an il y a-construction,
and vice-versa. Such a definition of ‘literal translation’ is legitimized by
the fact that the il y a-construction covers “much the same range of
functions as English existential there” (Bergen & Plauché 2005: 23) and
that the direct equivalents of English existential there-sentences are il y a-
constructions in French (cf. e.g. Lambrecht 1994: 178). The two
constructions are thus considered to be translationally equivalent as they
“convey the same ideational and interpersonal and textual meanings”
(James 1980: 178), providing a relevant tertium comparationis. This
means that the use of other existential constructions (see below) is not
considered to be a literal translation as the translator does not resort to a
direct equivalent, but to a translation shift as defined by Catford (1965):
“departure from formal correspondence in the process of going from SL to
TL” (Catford 1965: 73).
The uncovering of translators’ strategies is important for two reasons.
First of all, it is helpful in defining linguistic differences between the two
languages, which offer different strategies in addition to there-/il y a-
constructions to introduce new referents in the discourse. Second, it can
have a pedagogical interest by providing translation students with
guidelines that allow them to respect inter-language differences (ceteris
paribus, speakers of English resort more to existential constructions than
speakers of French) and to use other constructions to convey the same
logico-semantic meaning. This second aspect is in line with pedagogical
approaches like Vinay & Darbelnet (1958/1995), Chuquet & Paillard
(1989), Guillemin-Flescher (1981), Ballard (2003, 2004), which all
provide detailed descriptions of strategies/procedures used by translators
when resorting to non-literal translations, as well as generalizations as to
why and how these should take place.
In order to achieve this goal, I tagged the 150 sentences in A2
translated from sentences with there- and il y a-constructions depending
on the choice made by the translators of the 6 novels. I divided the
translations into 5 groups: (i) literal translation (see (4) above); (ii)
syntactic reorganizations (to be divided into different types); (iii) deletion;
(iv) idiomatic expressions; (v) non-translation.
The general results are provided in Table 4. The different strategies
used by translators are detailed in the remainder of this section. Note that
these results are not meant to be representative of all translated literary
texts – the parallel corpus is far too small for that. The aim of the analysis
here is to uncover some strategies used by translators to deal with
existential constructions.
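As a rough illustration of how the tagged sentences can be summarized, the sketch below tallies the counts reported in Table 4 and converts them into shares per translation direction. The counts are copied from the table; the variable and function names are hypothetical and are not part of the original analysis.

```python
# Counts copied from Table 4: 25 tagged sentences per novel, 75 per direction.
# Strategy labels and function names are illustrative assumptions.
STRATEGIES = [
    "literal",
    "syntactic_reorganization",
    "deletion",
    "idiomatic",
    "non_translated",
]

TABLE_4 = {
    "En>Fr": {
        "NH": [10, 11, 4, 0, 0],
        "JKR": [11, 9, 5, 0, 0],
        "HL": [6, 15, 4, 0, 0],
    },
    "Fr>En": {
        "FB": [15, 8, 2, 0, 0],
        "BW": [15, 7, 2, 1, 0],
        "ML": [12, 7, 2, 3, 1],
    },
}


def totals(direction):
    """Sum each strategy column over the three novels of one direction."""
    rows = TABLE_4[direction].values()
    return [sum(column) for column in zip(*rows)]


def shares(direction):
    """Return the percentage of each strategy, rounded to one decimal."""
    counts = totals(direction)
    n = sum(counts)
    return {name: round(100 * c / n, 1) for name, c in zip(STRATEGIES, counts)}


if __name__ == "__main__":
    for direction in TABLE_4:
        print(direction, totals(direction), shares(direction))
```

With so small a sample, such shares only describe the dataset itself; they are not estimates for translated literary texts in general.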
                 Literal       Syntactic                   Idiomatic     Non-
                 translation   reorganizations  Deletion   expressions   translated   Total
NH               10            11               4          0             0            25
JKR              11            9                5          0             0            25
HL               6             15               4          0             0            25
Total Eng > Fr   27            35               13         0             0            75
FB               15            8                2          0             0            25
BW               15            7                2          1             0            25
ML               12            7                2          3             1            25
Total Fr > Eng   42            22               6          4             1            75

Table 4. Specific translation strategies used to translate existential there-/il
y a-constructions

3.3.2.1. Syntactic reorganizations. A syntactic reorganization consists in a
modification of the syntactic organization of the source sentence, entailing
a change of viewpoint or perspective (Vinay & Darbelnet 1958/1995).
This translation procedure may concern the translation of an active
sentence by a passive sentence, a change in constituent order or a change
in the starting term/topic of the sentence. As regards the translation of
existential there-/il y a-constructions, three main recurring types of
syntactic reorganizations have been observed:
(i) the use of another existential construction such as an impersonal
subject followed by the verbs avoir or have (7), the use of a
presentational there-construction, involving a more specific verb than
be/avoir (8), the use of a perception verb with an animate subject to
establish the (in)existence of an entity (9), or the use of the verb avoir
with an inanimate object (10);
(ii) the insertion of a verb semantically associated with the notional
subject in the original existential construction, now the syntactic
subject in the translated sentence (11-13);
(iii) an inversion, whether locative or subject-verb inversion (14-15).
(7) a. I said drop it. There’s a guy dying in here. (HLNT)
b. J’ai dit : assez ! On a un mec en train de crever, là ! (HLT)
‘I said: enough! We have a guy dying, here’
(8) a. There is a certain group of people who will respond to one of
the most basic and pertinent of questions with a mild and
impatient blasphemy. (NHNT)
b. Il existe une certaine catégorie de gens qui répondent aux
questions les plus élémentaires et les plus pertinentes par ce
genre de blasphème bénin et impatient. (NHT)
‘There exists a certain category of people who answer the
most basic and pertinent questions with that kind of mild and
impatient blasphemy’
(9) a. Il n’y a pas des caméras et des micros partout ! (MLNT)
‘There aren’t any cameras or microphones everywhere’
b. Can you see any cameras or microphones? (MLT)
(10) a. Il y a plus de mille restaurants en ville. (HLNT)
‘There are more than a thousand restaurants in the city’
b. This city has more than a thousand restaurants. (HLT)
(11) a. There was a horrible smell in the kitchen the next morning
when Harry went in for breakfast. (JKRNT)
b. Le lendemain matin, au petit déjeuner, une odeur
pestilentielle se dégageait d'une grande bassine posée dans
l'évier de la cuisine. (JKRT)
‘The next morning, at breakfast, a horrible smell was coming
from a big bowl in the kitchen sink’
(12) a. The lights were out, but the curtains were wide open and
there was plenty of light coming in from the street. (HLNT)
b. Les lampes étaient éteintes mais, avec les rideaux ouverts, les
lumières de la rue éclairaient suffisamment la scène. (HLT)
‘The lights were out but, with the open curtains, the lights
from the street were illuminating sufficiently the scene’
(13) a. Inside there was a wall three feet away. (HLNT)
b. Un mur barrait l’intérieur à moins d’un mètre de la porte.
(HLT)
‘A wall closed off the inside less than one meter away from
the door’
(14) a. Cachée entre les cintres, il y avait une femme, les yeux clos,
apparemment envoûtée par le rythme de la chanson, faisant
claquer son pouce contre son index, elle fredonnait. (MLNT)
‘Hidden between the hangers, there was a woman, eyes
closed, apparently transported by the rhythm of the song,
snapping her thumb and forefinger together, she was
humming’
b. Huddled on the floor beneath the hangers sat a young
woman, eyes closed, seemingly transported by the rhythm of
the song, humming along to it and snapping her fingers.
(MLT)
(15) a. There was one of those pauses that you know is going to be
long as soon as it starts. (HLNT)
b. A suivi un de ces silences dont on anticipe tout de suite la
longueur. (HLT)
‘Followed one of those silences whose length you
immediately predict’
3.3.2.2. Deletion. Deletion, which is not listed as a translation procedure
by Vinay & Darbelnet but has been listed as such by many other
researchers (e.g. Ballard 2003, 2004; Schjoldager 2008), can be defined as
the non-rendering of a source-text item in the target text. Examples (16)-
(19) illustrate such a strategy. Note that deletion and non-translation are
not the same phenomenon (see 3.3.2.4).
(16) a. Papa, il y a un monsieur qui est venu tout à l'heure pour
relier un livre. (BWNT)
‘Dad, there is a man who came earlier to bind a book’
b. Dad, a man came while you were out. It was something about
binding a book. (BWT)
(17) a. 'I mean,' she said, 'there’s no ambulance coming here. Jesus.'
(HLNT)
b. Je veux dire qu’aucune ambulance n’arrivera ici. Mon
Dieu ! (HLT)
‘I mean that no ambulance is going to come here. My God!’
(18) a. When Mr. and Mrs. Dursley woke up on the dull, gray
Tuesday our story starts, there was nothing about the cloudy
sky outside to suggest that strange and mysterious things
would soon be happening all over the country. (JKRNT)
b. Lorsque Mr et Mrs Dursley s'éveillèrent, au matin du mardi
où commence cette histoire, il faisait gris et triste et rien dans
le ciel nuageux ne laissait prévoir que des choses étranges et
mystérieuses allaient bientôt se produire dans tout le pays.
(JKRT)
‘When Mr. and Mrs. Dursley woke up in the morning of the
Tuesday when this story begins, the weather was grey and
miserable and nothing in the cloudy sky suggested that strange
and mysterious things were about to happen all over the
country’
(19) a. J'ai lu dans un article qu'en cuisine, il y avait deux frères
inséparables qui travaillaient ensemble et nettoyaient les
crustacés, côte à côte. (FBNT)
‘I read in an article that in the kitchen, there were two
inseparable brothers who worked together and cleaned
shellfish, side by side’
b. In a magazine article, I read that two inseparable brothers
worked side by side cleaning shellfish in the kitchens. (FBT)
In each case, the existential construction is not rendered as such in the
target language. The non-canonical word order is transformed into a
canonical, unmarked, SVO word order. The notional subject is now the
syntactic subject of the sentence, but contrary to examples of type (11-13)
no specific verb semantically associated with the subject is added, as a
verb was already present in the source sentences within an embedded
clause post-modifying the notional subject (relative clause, participle
clause, or infinitive clause).
3.3.2.3. Idiomatic expressions. Idiomatic expressions can be defined as
“complex bits of frozen syntax, whose meanings are more than simply the
sum of their individual parts” (Nattinger & De Carrico 1992: 32). They
represent prefabricated strings of words with fixed meanings, and
generally are not subject to the principle of compositionality. They are
often difficult to translate and are to be treated as one single unit of
meaning whose counterpart in the target-language is generally
syntactically and semantically different from the idiom in the source
language. Existential constructions can be found in idiomatic expressions,
as in (20) or (21), where the il y a-construction serves its traditional
discourse function (see above) but is part of a fixed phrase that cannot be
translated literally.
(20) a. Qu'est-ce qu'il y a ? (BWNT)
‘What is it that there is?’
b. What's the matter? (BWT)
(21) a. Avoue-le, bon sang, que tu es sorti avec elle, puisque cela
fait vingt ans comme tu le dis, il y a prescription maintenant !
(MLNT)
‘Admit it, for God’s sake, that you went out with her, since
it’s been 20 years as you say, there is ‘prescription’ now!’
b. Admit it, for God's sake, you went out with her! It's been
fifteen years, like you said – the statute of limitations has
expired! (MLT)
There is no real strategy here on the part of the translators, who have no
choice but to find a pragmatic equivalent in the target language,
independently of the syntax and semantics of the source phrase.
3.3.2.4. Non-translated. As usual, when one compares source texts and
target texts, one realizes that for some reason, translators have decided not
to translate parts of sentences. This is only rarely the case in the dataset,
which contains only one occurrence of such a decision, where the part of
the sentence that contains the existential construction is not translated
(22). I do not consider this to be a translation strategy but mention the
example to illustrate this possibility when investigating parallel corpora. It
should be noted that this case is different from deletion, as the translator
does not even preserve the logico-semantic meaning of the part of the
sentence with the existential construction.
(22) a. Elle ajouta que c’était facile, il n’y avait que des cinq.
(MLNT)
‘And she added that it was easy, there were only fives’
b. Easy to remember. (MLT)
4. Discussion of the results
The answers to the two research questions that were formulated in the
introduction are thus the following:
1. The lower number of occurrences of existential constructions in
French translated texts as opposed to English original texts cannot be
explained by the sole omission of a certain number of existential
constructions in the translation process. The shifts (mostly syntactic
reorganization and deletion) are actually more numerous,
but are accompanied by the addition of il y a-constructions that do not
correspond to there-constructions in the source texts. In the same
way, the higher frequency of existential constructions in English
translated texts as opposed to French original texts actually
corresponds to a certain number of additions, but accompanied by a
significant number of omissions.
2. The strategies used by translators in case of non-literal translation (at
least those uncovered by the analysis of the dataset) are quite
restricted in number, with only two major types of procedures:
syntactic reorganization (use of another existential construction or of
a presentational construction; insertion of a verb with the notional
subject used as syntactic subject; inversion) and deletion, while
idiomatic expressions represent a non-strategy as translators are given
no choice but to find the equivalent in the target language.
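The first answer can be made concrete with a small computation over the totals of Table 3. The sketch below is only illustrative: it assumes, as the text states for the French-to-English direction, that omissions and additions are each counted over a sample of 75 constructions per direction, and the names used are hypothetical.

```python
# Totals copied from Table 3. SAMPLE_SIZE is an assumption: 75 source
# constructions (preserved + omitted) and 75 target constructions per direction.
SAMPLE_SIZE = 75

TABLE_3 = {
    "En>Fr": {"preserved": 27, "omitted": 48, "added": 42},
    "Fr>En": {"preserved": 42, "omitted": 33, "added": 34},
}


def summarize(counts):
    """Express each count as a percentage of the sample and compare
    additions with omissions as a rough balance."""
    return {
        "preserved_pct": round(100 * counts["preserved"] / SAMPLE_SIZE),
        "omitted_pct": round(100 * counts["omitted"] / SAMPLE_SIZE),
        "added_pct": round(100 * counts["added"] / SAMPLE_SIZE),
        "added_minus_omitted": counts["added"] - counts["omitted"],
    }


if __name__ == "__main__":
    for direction, counts in TABLE_3.items():
        print(direction, summarize(counts))
```

Under these assumptions, the balance between additions and omissions is small in both directions, while the individual counts are large: this is the sense in which the overall figures conceal opposite shifts.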
These answers stem from complementing an overall quantitative
analysis based on multi-million-word corpora with a finer-grained,
qualitative analysis of a small parallel dataset. My aim was to show that
the results provided in Cappelle & Loock (2013) were actually incomplete
and, in line with approaches like Johansson's (2007), to show that the
combination of comparable and parallel corpora is necessary for a
thorough cross-linguistic analysis of a specific linguistic phenomenon (see
also Altenberg & Granger (2002) or Granger (2003) for a discussion on
the advantages and disadvantages of the different types of corpora).
Ever since the advent of corpus-based approaches to translation
phenomena in the 1990s, overall quantitative approaches have regularly
been criticized and researchers have been warned against a sheer
numerical analysis such as those performed in contrastive linguistics
studies. For instance, Mason (2001) claims that such approaches have
important limitations in spite of their immense potential for the discipline
(Mason 2001: 78). According to him, while the quantitative study of
source texts in each language can provide information about norms of
language behavior, we cannot make do without studying source texts and
their translations to collect information on translation behavior if we want
to avoid vague generalizations. Mason clearly advocates a combination of
quantitative and qualitative analyses, and cautions against putting too
much emphasis on the scientific, quantitative nature of CBTS. Such
criticism is in line with Tymoczko (1998), who also warns against the
subjectivity present in numerical approaches, which can never be
completely objective: according to her, the research questions that are
formulated by researchers determine the results, and researchers should
not exploit corpora “merely to prove the obvious or give confirming
quantification where none is really needed, in short to engage in the type
of exercise that after much expense of time and money ascertains what
common sense knew anyway” (Tymoczko 1998: 657-658).
Although I agree with Tymoczko when she warns researchers against
regarding as sacred quantitative results obtained in corpus-based
translation studies, it is hard to agree with her when she claims that
corpora should not be used in order to “prove the obvious”. The obvious is
actually sometimes the hardest to prove scientifically (one just needs to
think about the fact that it was obvious that apples but not the Moon fell
onto the ground before the difference was explained by Newton’s
Universal Law of Gravitation).
In translation studies, research before the advent of corpus-based
studies in the 1990s and 2000s often relied on intuition – which itself
relied on experience – to try and define differences between languages
(see for instance Vinay & Darbelnet (1958/1995) as typical of such an
approach to English and French). Although such approaches have drawn
an impressive list of differences between languages, suggesting some
translation strategies to take them into account, we now know that some
differences, in Vinay & Darbelnet for instance, were over-generalized, such
as the (non-)use of connectors in French translated from English
(see Mason 2001).
Mason’s and Tymoczko’s objections to a sheer numerical approach to
translated texts are directly related to criticism of the type of
corpora that are to be used in CBTS. The sole use of monolingual
comparable corpora like the BNC and the TEC for English has been
criticized, for instance, as they cannot provide information about the
translation process itself and cannot shed light on translation possibilities
or idiosyncrasies. Caution is thus expressed by Malmkjaer (1998) for
example, for whom the use of small bilingual corpora allows for the
uncovering of information that the use of comparable corpora does not
provide in spite of their many advantages.
Finally, the exploitation of quantitative results to prove the existence
of translation universals as defined by Baker has been severely criticized
in recent years. Beyond criticism regarding the existence of the
universals themselves (see e.g. House 2008; Corpas Pastor et al. 2008;
Mauranen & Kujamäki 2004), some researchers claim that translation
universals, if they exist, cannot be confirmed by the sole exploitation of
general numerical differences between translated and original texts (e.g.
Becher 2010). De Sutter, Delaere & Plevoets (2012) in particular show
that “the relationship between translated texts and original texts is not
mono-, but multidimensional in nature, like most other linguistic
products” (De Sutter, Delaere & Plevoets 2012: 327).
Such criticism is taken into consideration by some researchers who
combine both a qualitative and a quantitative analysis of their data, with
some very interesting results. One must note that as early as the 1990s,
approaches like the ones advocated by Johansson (2007) or Ebeling
(1998) for instance, recommended the use of multilingual corpora so as to
combine comparable and parallel corpora, with a view to not only
determining inter- and intra-language differences, but also to accounting
for the observed differences as well as translators’ strategies. Many
studies now combine the exploitation of both types of corpora, thus
providing more detailed information on their object of study. A typical
example of such analyses is the one performed by Ramón & Labrador
(2008), who provided an analysis of the translation of -ly adverbs of
degree from English to Spanish, which can be translated literally with -
mente adverbs but also non-literally with a prepositional phrase or an
adjective phrase for instance. By using first a comparable monolingual
corpus to compare the use of -mente adverbs in original and translated
texts and then a parallel corpus to investigate the translational options
retained by translators (literal translation with a -mente adverb,
prepositional phrase, adjective, etc.), Ramón & Labrador provide
significant results and reveal cross-linguistic differences that can be
exploited for translator training and translation quality assessment. For
instance, translators have a tendency to overuse -mente adverbs when
translating -ly adverbs but in varying proportions depending on the type of
text (fiction vs. non-fiction), on the negative quality of -ly degree adverbs
(e.g. awfully, badly, poorly), or on the type of degree expressed by the
adverb (Ramón & Labrador noticed an overuse of literal translation for
absolute or very high degree adverbs like absolutely or extremely). They
also noticed that omission often occurred (8% of cases in their corpus),
and the finer-grained observation of their parallel corpus allowed them to
establish when such a translation strategy is likely to occur.
Nevertheless, such an approach is not the rule in the CBTS field. For
instance, general studies on differences between translated texts and
original texts such as Xiao (2010) on Chinese and Laviosa (1997; 1998)
on English, which both focus on the lexical features of translated texts
(lexical density; choice of high-frequency words), while providing
instructive, crucial information on general intra-language differences, do
not provide any specific information about the systematicity or non-
systematicity of the general patterns observed. This stems from the fact
that these studies rely on comparable monolingual corpora, namely TEC
vs. BNC for Laviosa (1997, 1998), ZJU Corpus of Translational Chinese
vs. Lancaster Corpus of Mandarin Chinese for Xiao (2010), and their aim
is to investigate such general intra-language differences, which are
interpreted in terms of translation universals. Their results, though highly
valuable, cannot be fully exploited for translator training and translation
quality assessment.
In the same way, the kind of qualitative approach adopted in Olohan’s
(2003) study on contractions in translated and original English is not
sufficient, despite the fact that it goes beyond an overall intra-language
comparison by investigating the influence of linguistic context on the
characteristics of translated texts. The study does provide some kind of
qualitative analysis by showing that the initial quantitative analysis hid
important variations in the use of contractions and that the co-text and the
discourse functions of contracted forms are important in that they both
have an influence on the (non-)use of contractions in texts. However, it
does not provide the full picture as it relies on a comparable corpus only.
This is of course irrelevant for Olohan’s study itself on contracted forms,
but what such an approach cannot tell us, in spite of its irrefutable
qualitative aspect, is whether there are cases that go against the general
trend. This could be the case if a full form in a source language where
contractions are used were translated with a contracted form.
Contractions are naturally an irrelevant tertium comparationis for such an
analysis, but the present study on existential constructions, which exist in
the two languages under consideration, has clearly shown that the general
results hid a significant number of counter-examples to the general trends.
With this article the aim was therefore to reaffirm that studies that
combine the exploitation of different types of corpora bring more
information than general numerical approaches that fail to provide the
whole picture to define precisely the characteristics of the third code. I
have shown that a finer-grained approach thanks to a parallel corpus, even
a small one, helps to (i) uncover the (non-)systematicity of the observed
general patterns and (ii) provide information about translators’ strategies.
5. Concluding remarks and future research
The article complements Cappelle & Loock’s (2013) study by providing a
finer-grained approach to the translation of existential constructions from
English to French and from French to English. I have shown that overall
results can hide more subtle translation strategies than what a general
approach based on comparable corpora can suggest. This study has also
provided information on the ways existential constructions can be
translated when a literal translation is not to be retained.
The next step now is to try and determine when each strategy is used
(correlation with linguistic context and/or pragmatic function): when
translators decide not to translate an existential construction literally, is
this due to a specific syntactic reorganization (e.g. a syntactically heavy
NP following there+BE or il y+AVOIR in the source-text, or a lexical gap
in the target language) or is this due to some more general usage
constraint? What are the linguistic factors that guide translators’ choices
between the different strategies at their disposal? Is there a link with the
different subtypes of existential constructions (see e.g. Huddleston &
Pullum’s (2002) distinction between ‘bare’ and ‘extended’ existential
constructions, depending on the presence/absence of an adverbial
extension or a modifier after the notional subject in the sentence)? It
would also be interesting to compare the results with results for other
genres, such as press articles: do we find similar inter-/intra-language
differences? Do translators use the same strategies when translating
existential constructions? The first question would be particularly relevant
if one wants to determine the role of translation universals, which by
definition are not supposed to be genre-dependent.
The finer-grained results discussed here are important if we consider that such
case studies can provide crucial information for both translator training
and translation quality assessment. Although there is no consensus on this
idea among researchers and translation trainers, in particular for literary
translation, results obtained within the field of corpus-based translation
studies can have powerful pedagogical value. While there is general
consensus on the fact that translated language differs from original
language in a number of ways, researchers still disagree on the
interpretation of these results: are they in some way a natural, translation-
inherent, phenomenon, revealing translation universals which, as such,
cannot be avoided, or should we say that translated texts that differ from
original texts in a specific language are translations that can be improved?
In other words, should we consider translated language as variation
comparable to dialectal variation or should we consider that the over-
representation or under-representation of a given linguistic construction,
say existential constructions, means that the quality of the translation
should be improved? From an even more general perspective, should we
consider that translated language is intrinsically different and represents
what researchers have called a third code (but it’s okay, though, it’s still
the same language) or should we consider that “the utopian goal is to
make it virtually impossible to tell the translation from an original text in
that language” (Teubert 1996: 241)? If we consider the second option,
then the results of this case study on existential constructions for instance
can be used for translator training. While students should be aware of
inter-language differences between English and French, and of
translation procedures other than literal translation that can be used to
translate existential constructions, they should also be warned that
systematically adding existential constructions when translating from
French to English or omitting them when translating from English to
French is a mistake.
Evaluating the quality of translations, which is probably the most
debated question concerning translation (see Depraetere 2011 or Loock et
al. 2013 for recent experiments), could also be approached objectively
by counting the existential constructions in the translations to determine
whether they are under- or over-represented in relation to original
language. All these questions related to translator
training and quality assessment are left open for future research within the
CorTEx research project (Corpus, Translation, Exploration) at the
University of Lille.
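As a rough illustration of the frequency check mentioned above, the
following Python sketch counts there+BE sequences per 1,000 words in two
texts and compares the two rates. The regular expression, the function
names and the sample sentences are assumptions made for this example only:
they are not the extraction procedure used in Cappelle & Loock (2013). A
simple pattern of this kind misses forms such as "there has been" and
wrongly includes locative "there" followed by BE, so manual checking of the
hits would remain necessary.

```python
import re

# Assumption: a crude surface pattern for existential there + BE.
# It only catches "there" immediately followed by a form of BE (or 's).
THERE_BE = re.compile(
    r"\bthere(?:'s|’s|\s+(?:is|are|was|were|be|been))\b",
    re.IGNORECASE,
)

# Assumption: a word is a run of word characters, optionally with one
# internal apostrophe (so "there's" counts as a single word).
WORD = re.compile(r"\w+(?:['’]\w+)?")


def per_thousand_words(text: str) -> float:
    """Return the number of there+BE hits per 1,000 words of `text`."""
    n_words = len(WORD.findall(text))
    if n_words == 0:
        return 0.0
    n_hits = len(THERE_BE.findall(text))
    return 1000 * n_hits / n_words


def representation_ratio(translated: str, original: str) -> float:
    """Rate in translated text divided by rate in original text.

    A value above 1 suggests over-representation in the translated text,
    a value below 1 suggests under-representation.
    """
    baseline = per_thousand_words(original)
    if baseline == 0:
        raise ValueError("no there+BE hits in the original-language text")
    return per_thousand_words(translated) / baseline


if __name__ == "__main__":
    # Invented sample sentences, for illustration only.
    translated = "There was a man. There is a dog here."
    original = "A man stood there. There was a dog in the yard."
    print(round(representation_ratio(translated, original), 2))
```

With real corpora, the two arguments would be the full translated and
original subcorpora of the same language, and the ratio would only be a
first indication to be followed by a significance test and a qualitative
reading of the hits.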
Acknowledgements
I would like to thank first of all Bert Cappelle, with whom I conducted the
quantitative study without which this additional analysis could not have
been carried out. I also thank the participants in the CorTEx project (Gert
De Sutter, Michaël Mariaule, and Cindy Lefebvre-Scodeller in particular)
for their comments and suggestions on the study. Finally, I thank the
anonymous reviewers for comments and suggestions that greatly
contributed to improving the quality of this paper. Any remaining errors
are naturally mine.
References
Altenberg, B. 1999. “Adverbial connectors in English and Swedish:
Semantic and lexical correspondences”. In H. Hasselgård &
S. Oksefjell (eds), Out of Corpora. Studies in Honour of Stig
Johansson, 249-268. Amsterdam: Rodopi.
Altenberg, B. & Granger, S. 2002. “Recent trends in cross-linguistic
lexical studies”. In B. Altenberg & S. Granger (eds), Lexis in
Contrast. Corpus-based Approaches, 3-48. Amsterdam-Philadelphia:
John Benjamins.
Baker, M. 1993. “Corpus linguistics and translation studies: Implications
and applications”. In M. Baker et al. (eds), Text and Technology, 233-
250. Amsterdam-Philadelphia: John Benjamins.
Baker, M. 1995. “Corpora in Translation Studies: An overview and some
suggestions for future research”. Target 7(2): 223-243.
Baker, M. 1996. “Corpus-based translation studies: The challenges that lie
ahead”. In H. Somers (ed.), Terminology, LSP and Translation.
Studies in Language Engineering in Honour of Juan C. Sager, 175-
186. Amsterdam-Philadelphia: John Benjamins.
Ballard, M. 2003. Versus: repérages et paramètres, vol. 1. Paris: Ophrys.
Ballard, M. 2004. Versus: des signes au texte, vol. 2. Paris: Ophrys.
Becher, V. 2010. “Abandoning the notion of 'translation-inherent'
explicitation: Against a dogma of translation studies”. Across
Languages and Cultures 11(1): 1-28.
Bergen, B.K. & Plauché, M.C. 2005. “The convergent evolution of radial
constructions: French and English deictics and existentials”.
Cognitive Linguistics 16(1): 1-42.
Birner, B. J. & Ward, G. 1998. Information Status and Noncanonical
Word Order. Amsterdam-Philadelphia: John Benjamins.
Cappelle, B. & Loock, R. 2013. “Is there interference of usage
constraints? A frequency study of existential there is and its French
equivalent il y a in translated vs. non-translated texts”. Target 25(2):
252-275.
Chuquet, H. & Paillard, M. 1987. Approche linguistique des problèmes de
traduction anglais-français. Paris: Ophrys.
Corpas Pastor, G., Mitkov, R., Afzal, N., & Pekar, V. 2008. “Translation
universals: do they exist? A corpus-based NLP study of convergence
and simplification”. In Proceedings of the Eighth Conference of the
Association for Machine Translation in the Americas (AMTA-08).
Waikiki, Hawaii.
Davies, M. 2004. BYU-BNC: The British National Corpus. Available at
http://corpus.byu.edu/bnc/.
De Sutter, G., Delaere, I. & Plevoets, K. 2012. “Lexical lectometry in
corpus-based translation studies: Combining profile-based
correspondence analysis and logistic regression modelling”. In M.P.
Oakes & J. Meng (eds), Quantitative Methods in Corpus-based
Translation Studies. A Practical Guide to Descriptive Translation
Research, 325-346. Amsterdam-Philadelphia: John Benjamins.
De Sutter, G., Goethals, P., Leuschner, T. & Vandepitte, S. 2012.
“Towards methodologically more rigorous corpus-based translation
studies”. Across Languages and Cultures 13(2): 137-143.
Depraetere, I. (ed.). 2011. Perspectives on Translation Quality. Berlin-
Boston: Walter de Gruyter.
Ebeling, J. 1998. “Contrastive linguistics, translation, and parallel
corpora”. Meta 43(4): 602-615.
Ebeling, J. 2000. “Using translations to explore construction meaning in
English and Norwegian”. In S. Johansson & S. Oksefjell (eds),
Corpora and Cross-linguistic Research: Theory, Method and Case
Studies, 169-195. Amsterdam-Atlanta: Rodopi.
Frawley, W. 1984. “Prolegomenon to a theory of translation”. In
W. Frawley (ed.), Translation: Literary, Linguistic and Philosophical
Perspectives, 159-175. Newark: University of Delaware Press.
Reprinted in L. Venuti (ed.) 2000. The Translation Studies Reader,
250-263. London-New York: Routledge.
Granger, S. 2003. “The corpus approach: A common way forward for
contrastive linguistics and translation studies?” In S. Granger, J. Lerot
& S. Petch-Tyson (eds), Corpus-based Approaches to Contrastive
Linguistics and Translation Studies, 17-29. Amsterdam-New York:
Rodopi.
Granger, S. 2010. “Comparable and translation corpora in cross-linguistic
research: Design analysis and applications”. Journal of Shanghai
Jiaotong University 2: 14-21.
Grevisse, M. 1986. Le bon usage, 12th edition by André Goosse. Paris:
Duculot.
Guillemin-Flescher, J. 1986. Syntaxe comparée du français et de
l’anglais. Problèmes de traduction. Gap-Paris: Ophrys.
House, J. 2008. “Beyond intervention: Universals in translation”.
Trans-kom 1: 6-19.
Huddleston, R. & Pullum, G. 2002. The Cambridge Grammar of the
English Language. Cambridge: Cambridge University Press.
Izquierdo, M., Hofland, K. & Reigem, Ø. 2008. “The ACTRES parallel
corpus: An English-Spanish translation corpus”. Corpora 3(1): 31-41.
James, C. 1980. Contrastive Analysis. London: Longman.
Johansson, S. 2007. Seeing through Multilingual Corpora. On the Use of
Corpora in Contrastive Studies. Amsterdam-Philadelphia: John
Benjamins.
Lambrecht, K. 1994. Information Structure and Sentence Form.
Cambridge: Cambridge University Press.
Laviosa, S. 1998. “Core patterns of lexical use in a comparable corpus of
English narrative prose”. Meta 43(4): 557-570.
Laviosa-Braithwaite, S. 1997. “Investigating simplification in an English
comparable corpus of newspaper articles”. In K. Klaudy & J. Kohn
(eds), Transferre Necesse Est. Proceedings of the 2nd International
Conference on Current Trends in Studies of Translation and
Interpreting, 5-7 September 1996, Budapest, Hungary, 531-540.
Budapest: Scholastica.
Loock, R., Mariaule, M. & Oster, C. 2013. “Traductologie de corpus et
qualité: étude de cas”. Paper presented at the Tralogy 2 Conference,
Paris, France, 17-18 January 2013.
Malmkjaer, K. 1998. “Love thy neighbour: Will parallel corpora endear
linguists to translators?” Meta 43(4): 534-541.
Mason, I. 2001. “Translator behaviour and language usage: Some
constraints on contrastive studies”. Journal of Linguistics 26: 65-80.
Mauranen, A. & Kujamäki, P. (eds). 2004. Translation Universals. Do
they Exist? Amsterdam-Philadelphia: John Benjamins.
Nattinger, J. R. & DeCarrico, J. S. 1992. Lexical Phrases and Language
Teaching. Oxford: Oxford University Press.
Olohan, M. 2003. “How frequent are the contractions?” Target 15(1): 59-
89.
Ramón, N. & Labrador, B. 2008. “Translation of -ly adverbs of degree in
an English-Spanish parallel corpus”. Target 20(2): 275-296.
Schjoldager, A. 2008. Understanding Translation. Århus: Academica.
Selinker, L. 1972. “Interlanguage”. International Review of Applied
Linguistics 10: 209-241.
Teubert, W. 1996. “Comparable or parallel corpora?” International
Journal of Lexicography 9: 238-264.
Tymoczko, M. 1998. “Computerized corpora and the future of translation
studies”. Meta 43(4): 652-658.
Vinay, J.-P. & Darbelnet, J. 1958/1995. Comparative Stylistics of French
and English: A Methodology for Translation (Translated and edited
by J. C. Sager & M.-J. Hamel). Amsterdam-Philadelphia: John
Benjamins.
Xiao, R. 2010. “How different is translated Chinese from native Chinese?
A corpus-based study of translation universals”. International Journal
of Corpus Linguistics 15(1): 5-35.
1 The terms ‘addition’ and ‘omission’ are taken from Johansson (2007:
26).
2 http://www.frantext.fr
3 http://www.llc.manchester.ac.uk/ctis/research/english-corpus
4 Linguists and grammarians have used two labels in the literature to refer
to English there-sentences, ‘existential’ and ‘presentational’ constructions,
which has led to some confusion. We use here, as in Cappelle & Loock
(2013), the label ‘existential’ to refer to there-sentences with the verb be
(in all its possible forms) and use ‘presentational’ for there-constructions
with other verbs.
5 Some electronic corpora of translated French do exist, like the PLECI
corpus (Poitiers-Louvain Echange de Corpus Informatisés) for instance
(see http://www.uclouvain.be/en-cecl-pleci.html), but have remained
modest in size and unavailable for other researchers. A bigger corpus of
translated literary texts is currently being built within the CorTEx project
(Corpus, Translation, Exploration) and is to contain ca. 5 million words.
6 ‘Mutual translatability’ refers to Altenberg’s (1999) definition of the
mutual correspondence (MC) statistical measure, which provides the
frequency with which a pair of items from two languages A and B (here
there-constructions and il y a-constructions) are translated into each other
in a bi-directional corpus. This is calculated as a percentage: ((frequency
in A translated texts + frequency in B translated texts)/(frequency in A
source texts + frequency in B source texts)) × 100.
7 The corpus compiled for Cappelle & Loock (2013) contains full novels
instead of extracts. Although this choice leads to a small number of novels
where other studies use corpora made of extracts from a higher number of
novels, diminishing the risk of author/translator idiosyncrasies, it
allowed a direct comparison between the number of existential
constructions in the English texts as opposed to the number of existential
constructions in the French texts, systematically lower in French than in
English, regardless of the translation direction.
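The mutual correspondence percentage described in note 6 can be
illustrated with a minimal Python sketch. The function name, the parameter
names and the counts in the usage example are hypothetical and are not
taken from the study. The numerator is read here as the number of source
occurrences of each item that are rendered by its counterpart in the
translations, which is an interpretation of the wording of the note.

```python
def mutual_correspondence(
    a_rendered_as_b: int,
    b_rendered_as_a: int,
    a_source_total: int,
    b_source_total: int,
) -> float:
    """Mutual correspondence (MC) of items A and B, as a percentage.

    a_rendered_as_b: occurrences of A in source texts translated by B
    b_rendered_as_a: occurrences of B in source texts translated by A
    a_source_total:  all occurrences of A in the language-A source texts
    b_source_total:  all occurrences of B in the language-B source texts
    """
    total_source = a_source_total + b_source_total
    if total_source == 0:
        raise ValueError("no source occurrences of either item")
    return 100 * (a_rendered_as_b + b_rendered_as_a) / total_source


if __name__ == "__main__":
    # Invented counts, for illustration only: 60 of 100 there-constructions
    # rendered by il y a, and 30 of 50 il y a-constructions rendered by there.
    print(mutual_correspondence(60, 30, 100, 50))  # (60 + 30) / (100 + 50) * 100
```

A value of 100 would mean that the two constructions are always translated
by each other, and a value of 0 that they never are.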