This is the final draft. For the published version, please check:

Close encounters of the third code: quantitative vs. qualitative studies in corpus-based translation studies, in Lefer, M.A. and Vogeleer, S., Interference and normalization in genre-controlled multilingual corpora, Belgian Journal of Linguistics 27: 61-86.

Close encounters of the third code: quantitative vs. qualitative analyses in corpus-based translation studies

Rudy Loock

Université Lille Nord de France & UMR 8163 Savoirs, Textes, Langage, CNRS

“[T]he translation itself […] is essentially a third code which arises out of the bilateral consideration of the matrix and target codes: it is, in a sense, a sub-code of each of the codes involved.” (Frawley 1984: 168)

“I guess you've noticed something a little strange with Dad. It's okay though. I'm still Dad.” (Roy Neary, Close Encounters of the Third Kind)

Abstract

The aim of the article is to complement a quantitative study on existential

constructions in French and English, both in translated and original texts

and based on the exploitation of comparable corpora (Cappelle & Loock

2013). What this article shows is that such an overall quantitative

approach should be complemented with a more qualitative approach, for

two main reasons: (i) overall quantitative results provide only a general

view on the differences between translated texts and original texts, hiding

subtle but crucial variations; (ii) the use of comparable corpora does not

provide any information on the strategies used by translators and on the

translation process itself. The article also provides suggestions for

implications in translator training and translation quality assessment.

Keywords: corpus-based translation studies, parallel corpus, comparable

corpus, existential constructions, quality, translator training

1. Introduction

In the same vein as many case studies within the corpus-based translation

studies (CBTS) framework, a recent study, Cappelle & Loock (2013), has

provided a comparative, quantitative analysis of existential there+BE

constructions in English (1a) and their counterpart il y+AVOIR

constructions in French (1b) (see section 2.1 for a complete definition of

the object of study), both in translated and original texts. Using both self-

collected and pre-existing electronic corpora of post-1980 fiction texts,

Cappelle & Loock (2013) have shown that significant statistical

differences exist not only between French and English (inter-language

differences) as is generally claimed by many contrastive-linguistics and

translation textbooks (e.g. Ballard 2003, 2004; Chuquet & Paillard 1989;

Guillemin-Flescher 1981), but also between translated English (from

French) and original English, as well as between translated French (from

English) and original French (intra-language differences).

(1) a. There was a dog in the garden.

b. Il y avait un chien dans le jardin.

Although the present article does not question the validity of the results

that have been obtained in this study, I claim, in line with studies like

Johansson (2007) for example, that such an overall quantitative analysis

does not suffice to provide a complete analysis, and thus aim to go further

by providing a finer-grained, qualitative analysis that answers two

important research questions concerning inter-language differences that

Cappelle & Loock's (2013) study does not address:

1. While the overall quantitative study shows that there are fewer

occurrences of existential constructions in French translated texts than

in the corresponding English original texts (see section 2 for a

summary of the results), can this be explained solely by the ‘omission’

of existential constructions through some syntactic reorganization or

deletion? Or are omissions actually more numerous than the net loss

suggests, being partly offset by the ‘addition’ of existential constructions

between the English source texts and the French target texts?1 A similar

question holds for texts translated from French to English, as there are

more existential constructions in translated English than in original French.

2. What kind of strategies are used by translators when translating

sentences with existential constructions, from French to English but

also from English to French, when a literal translation – that is a

translation with an existential construction in the target language – is

not retained?

More generally, the article aims to show that a thorough analysis of a

specific linguistic phenomenon within the CBTS framework requires both

quantitative and qualitative analyses to uncover the specific characteristics

of translated language, and thus requires resorting to both comparable

(monolingual and bilingual) and parallel corpora. This is an approach

advocated by Johansson (2007) in particular – actually suggested as early

as the 1990s – who claims that multilingual corpora like the English-

Norwegian Parallel Corpus (ENPC) are necessary for analyses to be

comprehensive. However, many studies within the CBTS framework still

solely rely on overall quantitative analyses to establish differences

between original and translated languages (see section 4 for examples).

The article thus aims to reassert that different types of corpora are indeed

required: (a) multi-million-word bilingual comparable corpora of original

language to uncover inter-language differences; (b) multi-million-word

monolingual comparable corpora of original and translated language to

uncover significant differences between translated and original texts, thus

uncovering specific features of what has been called interlanguage

(Selinker 1972) or third code (Frawley 1984), and (c) bi-directional

parallel corpora, even small ones – large electronic corpora of translated texts

are not available for all languages yet – for a qualitative analysis such as

the one described here. As the present study discusses Cappelle & Loock’s

(2013) results obtained with a literary corpus, the parallel corpus used for

this analysis also consists of literary, post-1980 texts.

The article is organized as follows. In section 2 the results of Cappelle

& Loock’s study are summarized and its limits are shown. In section 3, I

provide the results of my complementary analysis, based on a sample of

Cappelle & Loock’s self-collected corpus of original and translated

novels: first of all, detailed results on the addition, omission and

preservation of existential constructions in French to English and English

to French translations; second, results concerning translation strategies

when a literal translation is not retained. In section 4 I discuss the results

of the analysis within the context of other studies in the CBTS field and

show how important such qualitative analyses are, and how important the

use of parallel corpora is. Finally, in a concluding section, I discuss the

implications of my findings for translator training and translation quality

assessment.

2. Cappelle & Loock’s quantitative analysis

2.1. Corpus material

Cappelle & Loock’s (2013) study relies on data extracted from both a self-

collected corpus and a combination of existing large electronic corpora,

one of the aims of their study being a comparison of the results obtained

with the two types of corpora. The self-collected corpus consists of 12

post-1980 novels: 3 in original French, 3 in original English, and their

translations into the other language, for a total of nearly 1 million words (see

section 3.1. for more information). The electronic corpus consists of

extracts from the British National Corpus (Davies 2004) for original

English, from Frantext2 for original French, and from the Translational

English Corpus3 for translated English. As no electronic corpus of

translated French is currently available, no electronic data was used for

translated French (see below). The compiled electronic corpus consists of

15,909,312 words for the BNC sub-corpus, 11,365,626 words for the

Frantext sub-corpus, and 995,143 words for the TEC sub-corpus (with

only texts translated from French being retained), all sub-corpora

consisting of post-1980 fiction texts to allow for relevant comparisons.

The study thus resorts both to a parallel corpus and comparable

corpora as defined for example by Johansson (2007), Granger (2010), or

Teubert (1996): while comparable corpora are “corpora in two or more

languages with the same or similar composition”, a parallel corpus

consists of “a bilingual or multilingual corpus that contains one set of texts

in two or more languages” (Teubert 1996: 245, 247). However, the results

retained by Cappelle & Loock for their study are those obtained from their

electronic, comparable corpora (BNC, Frantext, TEC), except for

translated French for lack of such an electronic corpus. Cappelle & Loock

thus seem to suggest that the use of a smaller, parallel corpus is not

necessary, or even cannot be trusted, for a CBTS-oriented corpus study

that aims to uncover inter- and intra-language differences (see Cappelle &

Loock 2013: 269-270).

The tertium comparationis, that is “the common ground on which two

languages can be compared to establish (dis)similarities” (Ebeling 1998:

603), retained in Cappelle & Loock’s study is the existential construction,

that is there-constructions in English and il y a-constructions in French,

considered to be a case of translational equivalence (see below). The study

only covered existential constructions with the verbs be in English and

avoir in French, thus discriminating between such constructions and

presentational constructions, which include other verbs (e.g. come, arrive,

live, remain), as the two constructions actually differ (see Birner & Ward

1998).4 In addition, the so-called prepositional use of il y a in French

(Grevisse 1986) to define a period of time (il y a quatre ans [‘4 years

ago’]) was also discarded.

2.2. Results

What the corpus study has shown is that both significant inter-language

(French vs. English) and intra-language (translated vs. original English;

translated vs. original French) differences exist. A comparison between

the BNC and Frantext sub-corpora shows that there is a highly significant

difference between the two languages, with 2,581 occurrences per million

words (pmw) of there-constructions in original English, as opposed to

1,079 occurrences pmw of il y a-constructions in original French (z-ratio =

87.55; p<0.0001). The same difference, although less pronounced, is

revealed by the self-collected parallel corpus.
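
To make the inter-language comparison concrete, the following minimal sketch recomputes a z-ratio from the per-million-word frequencies and sub-corpus sizes given above. It rests on two assumptions that the text does not spell out: that the z-ratio is the standard pooled two-proportion z statistic, and that each running word can be treated as one trial. The function name `two_proportion_z` is hypothetical and serves illustration only.

```python
import math


def two_proportion_z(rate1_pmw, words1, rate2_pmw, words2):
    """Pooled two-proportion z statistic for two per-million-word rates.

    Assumption (not stated in the source): each word counts as one trial,
    and the reported z-ratio is the usual pooled z statistic.
    """
    p1 = rate1_pmw / 1_000_000
    p2 = rate2_pmw / 1_000_000
    # Pooled proportion over both sub-corpora.
    pooled = (p1 * words1 + p2 * words2) / (words1 + words2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / words1 + 1 / words2))
    z = (p1 - p2) / standard_error
    # Two-tailed p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Original English (BNC sub-corpus) vs. original French (Frantext sub-corpus).
z, p = two_proportion_z(2581, 15_909_312, 1079, 11_365_626)
print(f"z = {z:.2f}, p < 0.0001: {p < 0.0001}")
```

Under these assumptions, the BNC/Frantext comparison comes out close to the reported z-ratio of 87.55. The other comparisons may rest on slightly different counts and are not reproduced here.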

As far as intra-language comparison is concerned, the study has

revealed a highly significant difference between original and translated

English, with 1,376 occurrences pmw of there-constructions in the TEC

sub-corpus (z-ratio = 21.92; p<0.0001). Once again, the self-collected

corpus shows the same intra-language difference, although the difference

is less pronounced. Concerning the intra-language comparison for French,

given the unavailability of electronic data for translated texts, the study

compared results obtained from Frantext with those of the self-collected

sub-corpus of translated French. The observed difference is highly

significant (z-ratio = 7.19; p<0.0001) with 1,535 occurrences pmw in

translated French. The results are summarized in Figure 1.

What this case study has uncovered is that existential constructions

are much more frequent in English than in French, in line with many

contrastive-linguistics and translation textbooks. It also shows that

existential constructions are less frequent in translated English than in

original English, while they are more frequent in translated French than in

original French. Such intra-language results suggest a case of source-

language interference (translationese), although an interpretation

involving translation universals as defined by Baker (1993, 1995, 1996)

could also be involved (see Cappelle & Loock (2013) for a discussion; I

shall not tackle this issue here).

Figure 1. Results from Cappelle & Loock (2013): existential constructions

in English and French, translated and original (per million words)

From a methodological point of view, the study has also revealed that

results obtained with a small self-collected corpus consisting of 12 novels

and results obtained with larger electronic corpora might differ, sometimes

very significantly. What Cappelle & Loock (2013) conclude about this is

that large electronic corpora provide more reliable results than a random

collection of original novels and their translated counterparts.

The corpus study thus has strong methodological implications, while

many CBTS studies actually overlook the question of the

representativeness of the corpus they use (see special issue of Across

Languages and Cultures 13(2), 2012, on this question). This naturally

casts doubt on the results for intra-language differences for French, doubt

that will only be dispelled when an electronic corpus of translated French

becomes available to the community,5 but I will consider here that the

results are valid.

2.3. Limits

Although the quantitative analysis described above uncovers significant,

interesting results, it remains limited in that it does not provide the

complete picture for the numerical results and importantly, it does not

provide any information on the strategies used by translators to deal with

sentences with existential constructions, two results that could be of high

interest for translator training.

In particular, although Cappelle & Loock (2013) have shown that there is a

‘loss’ of existential constructions for texts translated from English into

French, they do not specify whether this loss corresponds solely to the

omission of such constructions or whether there are actually more

omissions, accompanied by a certain number of additions in translated

French. Knowing the frequency of existential

constructions in original English and in French translated from English,

even for the same texts, does not guarantee that the difference is only

due to a loss; it could actually correspond to a (higher) loss and a (small)

gain. The same issue holds for English translated from French, where the

use of existential constructions is higher than in original French: can this

‘gain’ be explained by additions only? Comparable corpora, whether

monolingual or bilingual, compare separate collections of texts. Although

they “represent ordinary language use in each language and should allow

safe conclusions on similarities and differences between the languages

compared” (Johansson 2007: 10), they cannot provide such information.

Moreover, comparable corpora cannot help uncover the strategies that

are used by translators when translating sentences with existential

constructions, in particular from English to French, when a literal

translation is not retained. Is the non-canonical word order that

characterizes an existential construction, in which the semantic subject is

postposed, while dummy there stands for the syntactic subject (see

Lambrecht 1994; Birner & Ward 1998), simply replaced with an

unmarked, SVO word order (1a’) or is/are there some specific type(s) of

syntactic reorganization that translators resort to on a systematic basis?

(1) a’. A dog was in the garden.

These two limits are to be tackled in this article thanks to an analysis of a

parallel corpus that provides source texts and target texts for English and

for French. I consider such an analysis to be qualitative as it goes beyond

overall results by providing a finer-grained description of the phenomenon

under study. The aim here is to show that even when such a corpus is

modest in size, it can provide valuable information that tempers overall

results obtained from multi-million-word comparable corpora, information

that can then be used for translator training and translation quality

assessment.

3. A qualitative analysis

3.1. Corpus material and data extraction

The dataset analyzed here is a sub-corpus extracted from the self-collected

corpus used in Cappelle & Loock’s study. This analysis allows for a direct

comparison between the quantitative analysis and a qualitative analysis, as

the aim is to show that the latter complements the former. Cappelle &

Loock’s (2013) corpus also seems particularly relevant in that regardless

of the novel being considered or the direction of translation, the frequency

of existential constructions is systematically lower for the novels in

French than for the novels in English (see Cappelle & Loock 2013, Figure

7).

The complete corpus is described in Table 1, taken from Cappelle &

Loock’s article.

Title and author/translator                               Reference  Language  Type        Year of publication  Number of words
The Gun Seller, H. Laurie                                 HLNT       English   original    1996                 106,866
How to Be Good, N. Hornby                                 NHNT       English   original    2001                 83,843
Harry Potter and the Philosopher’s Stone, J.K. Rowling    JKRNT      English   original    1997                 78,546
Tout est sous contrôle, J.-L. Piningre                    HLT        French    translated  2009                 105,894
La bonté : Mode d’emploi, I. Chapman                      NHT        French    translated  2003                 85,608*
Harry Potter à l’école des sorciers, J.-F. Ménard         JKRT       French    translated  1998                 85,240
Empire of the Ants, M. Rocques                            BWT        English   translated  1996                 96,162*
Windows on the World, F. Wynne                            FBT        English   translated  2004                 63,135*
If Only It Were True, J. Leggatt                          MLT        English   translated  2005                 65,157*
Les Fourmis, B. Werber                                    BWNT       French    original    1997                 89,988
Windows on the World, F. Beigbeder                        FBNT       French    original    2003                 65,749
Et si c’était vrai, M. Lévy                               MLNT       French    original    2000                 62,898*

Table 1. List of translated and original novels in the English-French self-

collected corpus (* = manual word count based on 20 pages chosen

randomly – electronic texts were used only when available)

The dataset used here is based on extracts from the 12 novels used in the

quantitative analysis. Specifically, I manually collected the first 25

occurrences of the existential construction in each of the 6 original novels

and in each of the 6 translated novels. The qualitative analysis thus rests

on a first sample of 300 existential constructions. To this were added the

translations of the original existential constructions (25x6, that is 150

translated sentences) as well as the original sentences that resulted in the

use of an existential construction in the translated texts (25x6, that is

another 150 original sentences). The dataset is summarized in Table 2.

Although this may seem a small number of tokens, I claim that such a

modest dataset suffices to show that the overall results must be tempered

and can provide valuable information in terms of translation strategies.

Sample A
A1   The first 25 existential constructions in each original novel (HLNT, NHNT, JKNT, BWNT, FBNT, MLNT)   150 tokens
A2   The corresponding 25 translations of the data in A1 (HLT, NHT, JKT, BWT, FBT, MLT)                   150 tokens

Sample B
B1   The first 25 existential constructions in each translated novel (HLT, NHT, JKT, BWT, FBT, MLT)       150 tokens
B2   The counterpart of the data in B1 in original texts (HLNT, NHNT, JKNT, BWNT, FBNT, MLNT)             150 tokens

Table 2. Dataset used for the qualitative analysis

Although the dataset theoretically amounts to 600 tokens in total, some of

them (but not all) were actually identical. For instance, in the case of a

literal translation as in (2), the original sentence and the translated

sentence were listed twice (e.g. (2a) is listed in both the A1 and B2

sub-corpora; (2b) is listed in both the A2 and B1 sub-corpora), but this is

exactly what the qualitative analysis is about: checking in detail, in a

parallel corpus, how many existential constructions are preserved, gained

or lost in the translation process. As a counter-example, (3a) and (3b) are

listed only once.

(2) a. And the reason I had to drop it was because there was a

dying man in the room. (HLNT)

b. Parce qu’il y avait un moribond dans la pièce. (HLT)

‘Because there was a dying man in the room’

(3) a. There was no point in worrying Mrs. Dursley. (JKNT)

b. Il était inutile d'inquiéter Mrs Dursley pour si peu. (JKT)

‘It was useless to worry Mrs Dursley for so little’

3.2. Method

The analysis took place in two stages. First, I listed the first 25 existential

constructions in each of the 6 original novels (sample A1) and checked

how many of them were translated literally in the target language, that is,

by their direct counterpart (there+BE by il y+AVOIR and vice-versa). This

made it possible to determine

the level of mutual translatability for the structure, as defined by

Altenberg (1999).6 I also listed the first 25 existential constructions in

each of the 6 translated novels (sample B1) and checked how many of

them were the results of a literal translation of a sentence with an

existential there- or il y a-construction. This enabled me to check, for each

translation direction, from English to French and from French to English,

what proportion of existential constructions was omitted, added, or

preserved after the translation process. Such results cannot be obtained

when exploiting comparable corpora. In spite of the relatively modest size

of the dataset and the small number of novels,7 such results show the non-

systematicity of the general patterns observed in Cappelle & Loock

(2013).
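
The first stage can be pictured as a simple tally over manually annotated sentence pairs. The sketch below is only an illustration of that bookkeeping: the `AlignedPair` record, the `classify` function and the boolean flags are hypothetical names, and the flags stand for the manual judgement on whether the source or the target sentence contains an existential construction (no automatic detection is attempted). The three pairs correspond to examples (4), (5) and (6) in section 3.3.1.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One source sentence and its translation, annotated by hand."""
    ref: str
    source_has_ec: bool  # existential construction in the source sentence?
    target_has_ec: bool  # existential construction in the target sentence?


def classify(pair: AlignedPair) -> str:
    """Label a pair as preservation, omission or addition."""
    if pair.source_has_ec and pair.target_has_ec:
        return "preservation"
    if pair.source_has_ec:
        return "omission"
    if pair.target_has_ec:
        return "addition"
    return "neither"  # not expected in samples built around existentials


# Hand-annotated flags for examples (4), (5) and (6).
pairs = [
    AlignedPair("(4) NHNT > NHT", source_has_ec=True, target_has_ec=True),
    AlignedPair("(5) HLNT > HLT", source_has_ec=True, target_has_ec=False),
    AlignedPair("(6) HLNT > HLT", source_has_ec=False, target_has_ec=True),
]

tally = Counter(classify(pair) for pair in pairs)
print(dict(tally))
```

Pairs drawn from sample A1 can only yield preservations or omissions, while pairs drawn from sample B1 can only yield preservations or additions, which is why both samples are needed.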

The second stage consisted in tagging each translated sentence in A2

to uncover some regular strategies used by translators, something which

once again cannot be achieved with a comparable corpus. This

type of analysis is in line with e.g. Vinay & Darbelnet (1958/1995),

Ballard (2003, 2004), Chuquet & Paillard (1989), Guillemin-Flescher

(1981), which aim at providing translators with guidelines resting on

actual language use. Such results are meant to have a pedagogical value,

uncovering some usage constraints that need to be dealt with by

translators if they want to provide natural-sounding translations. This type

of analysis is also in line with Ebeling (1998, 2000), who stresses the

importance of parallel corpora to provide valuable information on

differences and similarities between languages, as well as on translation

behavior.

3.3. Results

3.3.1. Stage 1 results

The results, which are summed up in Table 3, shed new light on those

obtained in Cappelle & Loock (2013). Overall, the results reveal low

mutual translatability (Altenberg 1999; see note 6) of 46% for the whole

dataset, which clearly shows that the two structures are not completely

translationally equivalent, as was already revealed by the significant inter-

language and intra-language differences.
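
As a worked illustration of the 46% figure, the sketch below pools the preservation counts reported in this section for the two translation directions (27 out of 75 for English to French, 42 out of 75 for French to English). It assumes that mutual translatability is the share of source occurrences rendered by their direct counterpart, pooled over both directions; the function name `mutual_translatability` is hypothetical.

```python
def mutual_translatability(preserved_a, sources_a, preserved_b, sources_b):
    """Percentage of source constructions translated by their direct
    counterpart, pooled over both translation directions.

    Assumption: this pooled share is what the mutual translatability
    measure amounts to; the exact definition is given in a note that is
    not reproduced here.
    """
    return 100 * (preserved_a + preserved_b) / (sources_a + sources_b)


# Counts reported in this section (sample A1, 75 source constructions each).
en_to_fr = 100 * 27 / 75   # share preserved, English > French
fr_to_en = 100 * 42 / 75   # share preserved, French > English
overall = mutual_translatability(27, 75, 42, 75)

print(f"En>Fr: {en_to_fr:.0f}%  Fr>En: {fr_to_en:.0f}%  overall: {overall:.0f}%")
```

The pooled value matches the 46% reported above, while the two directions taken separately (36% and 56%) already show that the overall figure hides an asymmetry.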

Let us first consider English to French translations, a translation

direction for which we expect a certain loss of existential constructions if

we consider the data from the quantitative analysis based on the

comparable corpora. What can be noticed is that, in addition to the

omission of 48 (out of the 75) occurrences of existential constructions,

there is a significant proportion of preserved existential constructions (27),

and above all, quite a significant number of additions (42 occurrences for

the three novels). While the proportion of omissions vs. preservations is

predictable (only 36% of there-sentences are translated with il y a-

sentences), the last result, namely the proportion of additions, is quite

unexpected, as it goes against the general trend uncovered by Cappelle &

Loock (2013), which showed that existential constructions are fewer in

number in French, whether translated or original. Examples listed in (4),

(5), and (6) show an instance of each case (preservation, omission and

addition, respectively):

(4) a. And I make some weak what-kind-of-girl-do-you-think-I-am

joke, but of course there's nothing much to joke about, really.

(NHNT)

b. Et moi je riposte mollement par une vanne du style “Vous me

prenez pour qui?”, mais bien sûr, il n'y a pas de quoi rire.

(NHT)

‘And I gently respond with a joke like "Who do you think I

am?", but of course, there is nothing to laugh about’

(5) a. At the prices they were charging for liquor, there were

probably only a couple of dozen people in the world who

could afford to stick around for a second drink. (HLNT)

b. Vu le prix des boissons alcoolisées, à peine plus d’une

vingtaine de personnes en ce bas monde avaient les moyens

de commander un deuxième verre ici. (HLT)

‘Given the prices of alcoholic beverages, hardly about twenty

people in the world could afford ordering a second drink

here’

(6) a. "Mr Woolf," I said, "before you name a place, make sure you

can book it for at least ten people". (HLNT)

b. Monsieur Woolf, avant de me proposer un restaurant,

assurez-vous qu’il y a de la place pour une dizaine de

personnes environ. (HLT)

‘Mr. Woolf, before you suggest a restaurant, make sure that

there is enough room for about ten people’

| Direction | Novel | In the source text and preserved in the target text (PRESERVATION) | In the source text but not kept in the target text (OMISSION) | Not in the source text but added in the target text (ADDITION) |
|---|---|---|---|---|
| English > French | HL | 6 | 19 | 13 |
| | JKR | 11 | 14 | 16 |
| | NH | 10 | 15 | 13 |
| | Total En>Fr | 27 (36%) | 48 (64%) | 42 |
| French > English | ML | 12 | 13 | 15 |
| | BW | 15 | 10 | 11 |
| | FB | 15 | 10 | 8 |
| | Total Fr>En | 42 (56%) | 33 (44%) | 34 |

Table 3. General translation strategies concerning source and target existential constructions

As far as French to English translations are concerned, what we notice is that a higher proportion of existential constructions are preserved (42 occurrences out of 75, that is 56%), which is in line with Cappelle & Loock’s results: a gain is to be expected when the target language is English and the source language is French. But the expected gain is strengthened by the addition of a high number of there-constructions (34 of the first 75 there-constructions in the translated novels were not in the original texts). On the other hand, we also notice quite a significant proportion of omissions (33 occurrences), which means that the (minimal) gain hides a certain number of omissions. Once again, the picture provided by the quantitative analysis of comparable corpora is actually incomplete.

These results are discussed more thoroughly in section 4, but what they inevitably show is that the differences uncovered through the use of comparable corpora cannot provide the full picture of the distribution of a specific construction in translated and original texts. Even the analysis of a small parallel dataset shows that the general patterns observed in Cappelle & Loock (2013) are not systematic. A finer-grained approach with a bi-directional parallel corpus is therefore necessary.
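The proportions discussed above can be re-derived from the per-novel counts in Table 3. The following Python sketch is only an illustration and is not part of the original analysis: the names `Counts`, `TABLE_3`, `totals` and `preservation_rate` are hypothetical, and the figures are simply copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Counts for one novel or one translation direction (hypothetical helper)."""
    preserved: int
    omitted: int
    added: int


# Per-novel figures copied from Table 3; nothing here is new data.
TABLE_3 = {
    "En>Fr": {
        "HL": Counts(6, 19, 13),
        "JKR": Counts(11, 14, 16),
        "NH": Counts(10, 15, 13),
    },
    "Fr>En": {
        "ML": Counts(12, 13, 15),
        "BW": Counts(15, 10, 11),
        "FB": Counts(15, 10, 8),
    },
}


def totals(direction: str) -> Counts:
    """Sum the per-novel rows for one translation direction."""
    rows = list(TABLE_3[direction].values())
    return Counts(
        preserved=sum(r.preserved for r in rows),
        omitted=sum(r.omitted for r in rows),
        added=sum(r.added for r in rows),
    )


def preservation_rate(counts: Counts) -> float:
    """Share of source existential constructions kept as such in the target text."""
    return counts.preserved / (counts.preserved + counts.omitted)


for direction in TABLE_3:
    t = totals(direction)
    print(
        f"{direction}: preserved={t.preserved}, omitted={t.omitted}, "
        f"added={t.added}, preservation rate={preservation_rate(t):.0%}"
    )
```

The additions column is deliberately left out of the rate, since additions are counted among target-text constructions and therefore do not share the source-text denominator of 75.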

3.3.2. Stage 2 results

The second aim of the present study is to uncover the strategies used by

translators when a literal translation of sentences with existential

constructions is not retained. What is meant by literal translation is the use

of a there-construction to translate a sentence with an il y a-construction,

and vice-versa. Such a definition of ‘literal translation’ is legitimized by

the fact that the il y a-construction covers “much the same range of

functions as English existential there” (Bergen & Plauché 2005: 23) and

that the direct equivalents of English existential there-sentences are il y a-

constructions in French (cf. e.g. Lambrecht 1994: 178). The two

constructions are thus considered to be translationally equivalent as they

“convey the same ideational and interpersonal and textual meanings”

(James 1980: 178), providing a relevant tertium comparationis. This

means that the use of other existential constructions (see below) is not

considered to be a literal translation as the translator does not resort to a

direct equivalent, but to a translation shift as defined by Catford (1965):

“departure from formal correspondence in the process of going from SL to

TL” (Catford 1965: 73).

The uncovering of translators’ strategies is important for two reasons.

First of all, it is helpful in defining linguistic differences between the two

languages, which offer different strategies in addition to there-/il y a-

constructions to introduce new referents in the discourse. Second, it can

have a pedagogical interest by providing translation students with

guidelines that allow them to respect inter-language differences (ceteris

paribus, speakers of English resort more to existential constructions than

speakers of French) and to use other constructions to convey the same

logico-semantic meaning. This second aspect is in line with pedagogical

approaches like Vinay & Darbelnet (1958/1995), Chuquet & Paillard

(1989), Guillemin-Flescher (1981), Ballard (2003, 2004), which all

provide detailed descriptions of strategies/procedures used by translators

when resorting to non-literal translations, as well as generalizations as to

why and how these should take place.

In order to achieve this goal, I tagged the 150 sentences in A2

translated from sentences with there- and il y a-constructions depending

on the choice made by the translators of the 6 novels. I divided the

translations into 5 groups: (i) literal translation (see (4) above); (ii)

syntactic reorganizations (to be divided into different types); (iii) deletion;

(iv) idiomatic expressions; (v) non-translation.

The general results are provided in Table 4. The different strategies used by translators are detailed in the remainder of this section. Note that these results are not meant to be representative of all translated literary texts – the parallel corpus is far too small for that. The aim of the analysis here is to uncover some strategies used by translators to deal with existential constructions.
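The five-way tagging scheme and the counts reported in Table 4 can be summarized with a short Python sketch. It is offered as an illustration under stated assumptions and not as the procedure actually followed for the annotation, which was manual: the names `Strategy`, `TABLE_4`, `tally` and `shares` are hypothetical, and the figures are copied from Table 4.

```python
from collections import Counter
from enum import Enum


class Strategy(Enum):
    """The five groups used to tag each translated sentence."""
    LITERAL = "literal translation"
    REORGANIZATION = "syntactic reorganization"
    DELETION = "deletion"
    IDIOM = "idiomatic expression"
    NOT_TRANSLATED = "non-translation"


# Per-novel counts copied from Table 4, in the order of the Strategy members.
TABLE_4 = {
    "En>Fr": {
        "NH": (10, 11, 4, 0, 0),
        "JKR": (11, 9, 5, 0, 0),
        "HL": (6, 15, 4, 0, 0),
    },
    "Fr>En": {
        "FB": (15, 8, 2, 0, 0),
        "BW": (15, 7, 2, 1, 0),
        "ML": (12, 7, 2, 3, 1),
    },
}


def tally(direction: str) -> Counter:
    """Add up the per-novel counts for one translation direction."""
    total = Counter()
    for row in TABLE_4[direction].values():
        # Iterating over an Enum class yields its members in definition order.
        total.update(dict(zip(Strategy, row)))
    return total


def shares(direction: str) -> dict:
    """Proportion of each strategy among the 75 tagged sentences."""
    counts = tally(direction)
    n = sum(counts.values())
    return {strategy: counts[strategy] / n for strategy in Strategy}


for direction in TABLE_4:
    print(direction)
    for strategy, share in shares(direction).items():
        print(f"  {strategy.value}: {tally(direction)[strategy]} ({share:.1%})")
```

Given the size of the dataset, such proportions should be read as descriptive of these six novels only.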

| | Literal translation | Syntactic reorganizations | Deletion | Idiomatic expressions | Non translated | Total |
|---|---|---|---|---|---|---|
| NH | 10 | 11 | 4 | 0 | 0 | 25 |
| JKR | 11 | 9 | 5 | 0 | 0 | 25 |
| HL | 6 | 15 | 4 | 0 | 0 | 25 |
| Total Eng > Fr | 27 | 35 | 13 | 0 | 0 | 75 |
| FB | 15 | 8 | 2 | 0 | 0 | 25 |
| BW | 15 | 7 | 2 | 1 | 0 | 25 |
| ML | 12 | 7 | 2 | 3 | 1 | 25 |
| Total Fr > Eng | 42 | 22 | 6 | 4 | 1 | 75 |

Table 4. Specific translation strategies used to translate existential there-/il y a-constructions

3.3.2.1. Syntactic reorganizations. A syntactic reorganization consists in a modification of the syntactic organization of the source sentence, entailing a change of viewpoint or perspective (Vinay & Darbelnet 1958/1995). This translation procedure may concern the translation of an active sentence by a passive sentence, a change in constituent order or a change in the starting term/topic of the sentence. As regards the translation of existential there-/il y a-constructions, three main recurring types of syntactic reorganizations have been observed:


(i) the use of another existential construction such as an impersonal

subject followed by the verbs avoir or have (7), the use of a

presentational there-construction, involving a more specific verb than

be/avoir (8), the use of a perception verb with an animate subject to

establish the (in)existence of an entity (9), or the use of the verb avoir

with an inanimate object (10);

(ii) the insertion of a verb semantically associated with the notional

subject in the original existential construction, now the syntactic

subject in the translated sentence (11-13);

(iii) an inversion, whether locative or subject-verb inversion (14-15).

(7) a. I said drop it. There’s a guy dying in here. (HLNT)

b. J’ai dit : assez ! On a un mec en train de crever, là ! (HLT)

‘I said: enough! We have a guy dying, here’

(8) a. There is a certain group of people who will respond to one of

the most basic and pertinent of questions with a mild and

impatient blasphemy. (NHNT)

b. Il existe une certaine catégorie de gens qui répondent aux

questions les plus élémentaires et les plus pertinentes par ce

genre de blasphème bénin et impatient. (NHT)

‘There exists a certain category of people who answer the

most basic and pertinent questions with that kind of mild and

impatient blasphemy’

(9) a. Il n’y a pas des caméras et des micros partout ! (MLNT)

‘There aren’t any cameras or microphones everywhere’

b. Can you see any cameras or microphones? (MLT)

(10) a. Il y a plus de mille restaurants en ville. (HLNT)

‘There are more than a thousand restaurants in the city’

b. This city has more than a thousand restaurants. (HLT)

(11) a. There was a horrible smell in the kitchen the next morning

when Harry went in for breakfast. (JKRNT)

b. Le lendemain matin, au petit déjeuner, une odeur

pestilentielle se dégageait d'une grande bassine posée dans

l'évier de la cuisine. (JKRT)

‘The next morning, at breakfast, a horrible smell was coming

from a big bowl in the kitchen sink’

(12) a. The lights were out, but the curtains were wide open and

there was plenty of light coming in from the street. (HLNT)

b. Les lampes étaient éteintes mais, avec les rideaux ouverts, les

lumières de la rue éclairaient suffisamment la scène. (HLT)

‘The lights were out but, with the open curtains, the lights

from the street were illuminating sufficiently the scene’

(13) a. Inside there was a wall three feet away. (HLNT)

b. Un mur barrait l’intérieur à moins d’un mètre de la porte.

(HLT)

‘A wall closed off the inside less than one meter away from

the door’

(14) a. Cachée entre les cintres, il y avait une femme, les yeux clos,

apparemment envoûtée par le rythme de la chanson, faisant

claquer son pouce contre son index, elle fredonnait. (MLNT)

‘Hidden between the hangers, there was a woman, eyes

closed, apparently transported by the rhythm of the song,

snapping her thumb and forefinger together, she was

humming’

b. Huddled on the floor beneath the hangers sat a young

woman, eyes closed, seemingly transported by the rhythm of

the song, humming along to it and snapping her fingers.

(MLT)

(15) a. There was one of those pauses that you know is going to be

long as soon as it starts. (HLNT)

b. A suivi un de ces silences dont on anticipe tout de suite la

longueur. (HLT)

‘Followed one of those silences whose length you

immediately predict’

3.3.2.2. Deletion. Deletion, which is not listed as a translation procedure

by Vinay & Darbelnet but has been listed as such by many other

researchers (e.g. Ballard 2003, 2004; Schjoldager 2008), can be defined as

the non-rendering of a source-text item in the target text. Examples (16)-

(19) illustrate such a strategy. Note that deletion and non-translation are

not the same phenomenon (see 3.3.2.4).

(16) a. Papa, il y a un monsieur qui est venu tout à l'heure pour

relier un livre. (BWNT)

‘Dad, there is a man who came earlier to bind a book’

b. Dad, a man came while you were out. It was something about

binding a book. (BWT)

(17) a. 'I mean,' she said, 'there’s no ambulance coming here. Jesus.'

(HLNT)

b. Je veux dire qu’aucune ambulance n’arrivera ici. Mon

Dieu ! (HLT)

‘I mean that no ambulance is going to come here. My God!’

(18) a. When Mr. and Mrs. Dursley woke up on the dull, gray

Tuesday our story starts, there was nothing about the cloudy

sky outside to suggest that strange and mysterious things

would soon be happening all over the country. (JKRNT)

b. Lorsque Mr et Mrs Dursley s'éveillèrent, au matin du mardi

où commence cette histoire, il faisait gris et triste et rien dans

le ciel nuageux ne laissait prévoir que des choses étranges et

mystérieuses allaient bientôt se produire dans tout le pays.

(JKRT)

‘When Mr. and Mrs. Dursley woke up in the morning of the

Tuesday when this story begins, the weather was grey and

miserable and nothing in the cloudy sky suggested that strange

and mysterious things were about to happen all over the

country’

(19) a. J'ai lu dans un article qu'en cuisine, il y avait deux frères

inséparables qui travaillaient ensemble et nettoyaient les

crustacés, côte à côte. (FBNT)

‘I read in an article that in the kitchen, there were two

inseparable brothers who worked together and cleaned

shellfish, side by side’

b. In a magazine article, I read that two inseparable brothers

worked side by side cleaning shellfish in the kitchens. (FBT)

In each case, the existential construction is not rendered as such in the

target language. The non-canonical word order is transformed into a

canonical, unmarked, SVO word order. The notional subject is now the

syntactic subject of the sentence, but contrary to examples of type (11-13)

no specific verb semantically associated with the subject is added, as a

verb was already present in the source sentences within an embedded

clause post-modifying the notional subject (relative clause, participle

clause, or infinitive clause).

3.3.2.3. Idiomatic expressions. Idiomatic expressions can be defined as

“complex bits of frozen syntax, whose meanings are more than simply the

sum of their individual parts” (Nattinger & De Carrico 1992: 32). They

represent prefabricated strings of words with fixed meanings, and

generally are not subject to the principle of compositionality. They are

often difficult to translate and are to be treated as one single unit of

meaning whose counterpart in the target-language is generally

syntactically and semantically different from the idiom in the source

language. Existential constructions can be found in idiomatic expressions,

as in (20) or (21), where the il y a-construction serves its traditional

discourse function (see above) but is part of a fixed phrase that cannot be

translated literally.

(20) a. Qu'est-ce qu'il y a ? (BWNT)

‘What is it that there is?’

b. What's the matter? (BWT)

(21) a. Avoue-le, bon sang, que tu es sorti avec elle, puisque cela

fait vingt ans comme tu le dis, il y a prescription maintenant !

(MLNT)

‘Admit it, for God’s sake, that you went out with her, since

it’s been 20 years as you say, there is ‘prescription’ now!’

b. Admit it, for God's sake, you went out with her! It's been

fifteen years, like you said – the statute of limitations has

expired! (MLT)

There is no real strategy here on the part of the translators, who have no

choice but to find a pragmatic equivalent in the target language,

independently of the syntax and semantics of the source phrase.

3.3.2.4. Non-translated. As usual, when one compares source texts and

target texts, one realizes that for some reason, translators have decided not

to translate parts of sentences. This is only rarely the case in the dataset,

which contains only one occurrence of such a decision, where the part of

the sentence that contains the existential construction is not translated

(22). I do not consider this to be a translation strategy but mention the

example to illustrate this possibility when investigating parallel corpora. It

should be noted that this case is different from deletion, as the translator

does not even preserve the logico-semantic meaning of the part of the

sentence with the existential construction.

(22) a. Elle ajouta que c’était facile, il n’y avait que des cinq.

(MLNT)

‘And she added that it was easy, there were only fives’

b. Easy to remember. (MLT)

4. Discussion of the results

The answers to the two research questions that were formulated in the

introduction are thus the following:

1. The lower number of occurrences of existential constructions in

French translated texts as opposed to English original texts cannot be

explained by the sole omission of a certain number of existential

constructions in the translation process. The shifts (mostly syntactic

reorganization and deletion) are actually more numerous,

but are accompanied by the addition of il y a-constructions that do not

correspond to there-constructions in the source texts. In the same

way, the higher frequency of existential constructions in English

translated texts as opposed to French original texts actually

corresponds to a certain number of additions, but accompanied by a

significant number of omissions.

2. The strategies used by translators in case of non-literal translation (at

least those uncovered by the analysis of the dataset) are quite

restricted in number, with only two major types of procedures:

syntactic reorganization (use of another existential construction or of

a presentational construction; insertion of a verb with the notional

subject used as syntactic subject; inversion) and deletion, while

idiomatic expressions represent a non-strategy as translators are given

no choice but to find the equivalent in the target language.

These answers stem from the complementing of an overall, quantitative,

analysis based on multi-million word corpora with a finer-grained,

qualitative analysis of a small parallel dataset. My aim was to show that

the results provided in Cappelle & Loock (2013) were actually incomplete

and, in line with approaches like Johansson's (2007), to show that the

combination of comparable and parallel corpora is necessary for a

thorough cross-linguistic analysis of a specific linguistic phenomenon (see

also Altenberg & Granger (2002) or Granger (2003) for a discussion on

the advantages and disadvantages of the different types of corpora).

Ever since the advent of corpus-based approaches to translation

phenomena in the 1990s, overall quantitative approaches have regularly

been criticized and researchers have been warned against sheer

numerical analyses such as those performed in contrastive linguistics

studies. For instance, Mason (2001) claims that such approaches have

important limitations in spite of their immense potential for the discipline

(Mason 2001: 78). According to him, while the quantitative study of

source texts in each language can provide information about norms of

language behavior, we cannot make do without studying source texts and

their translations to collect information on translation behavior if we want

to avoid vague generalizations. Mason clearly advocates a combination of

quantitative and qualitative analyses, and cautions against putting too

much emphasis on the scientific, quantitative nature of CBTS. Such

criticism is in line with Tymoczko (1998), who also warns against the

subjectivity present in numerical approaches, which can never be

completely objective: according to her, the research questions that are

formulated by researchers determine the results, and researchers should

not exploit corpora “merely to prove the obvious or give confirming

quantification where none is really needed, in short to engage in the type

of exercise that after much expense of time and money ascertains what

common sense knew anyway” (Tymoczko 1998: 657-658).

Although I agree with Tymoczko when she warns researchers against

regarding as sacred quantitative results obtained in corpus-based

translation studies, it is hard to agree with her when she claims that

corpora should not be used in order to “prove the obvious”. The obvious is

actually sometimes the hardest to prove scientifically (one just needs to

think about the fact that it was obvious that apples but not the Moon fell

onto the ground before the difference was explained by Newton’s

Universal Law of Gravitation).

In translation studies, research before the advent of corpus-based

studies in the 1990s and 2000s often relied on intuition – which itself

relied on experience – to try and define differences between languages

(see for instance Vinay & Darbelnet (1958/1995) as typical of such an

approach to English and French). Although such approaches have drawn up

an impressive list of differences between languages, suggesting some

translation strategies to take them into account, we now know that some

differences, in Vinay & Darbelnet for instance, were over-generalized, such

as the (non-)use of connectors in French translated from English

(see Mason 2001).

Mason’s and Tymoczko’s objections to a sheer numerical approach to

translated texts are directly related to criticism of the type of

corpora that are to be used in CBTS. The sole use of monolingual

comparable corpora like the BNC and the TEC for English has been

criticized, for instance, as they cannot provide information about the

translation process itself and cannot shed light on translation possibilities

or idiosyncrasies. Caution is thus expressed by Malmkjaer (1998) for

example, for whom the use of small bilingual corpora allows for the

uncovering of information that the use of comparable corpora does not

provide in spite of their many advantages.

Finally, the exploitation of quantitative results to prove the existence

of translation universals as defined by Baker has been severely criticized

in recent years. Beyond criticism regarding the existence of the

universals themselves (see e.g. House 2008; Corpas Pastor et al. 2008;

Mauranen & Kujamäki 2004), some researchers claim that translation

universals, if they exist, cannot be confirmed by the sole exploitation of

general numerical differences between translated and original texts (e.g.

Becher 2010). De Sutter, Delaere & Plevoets (2012) in particular show

that “the relationship between translated texts and original texts is not

mono-, but multidimensional in nature, like most other linguistic

products” (De Sutter, Delaere & Plevoets 2012: 327).

Such criticism is taken into consideration by some researchers who

combine both a qualitative and a quantitative analysis of their data, with

some very interesting results. One must note that as early as the 1990s,

approaches like the ones advocated by Johansson (2007) or Ebeling

(1998) for instance, recommended the use of multilingual corpora so as to

combine comparable and parallel corpora, with a view not only to

determining inter- and intra-language differences, but also to accounting

for the observed differences as well as translators’ strategies. Many

studies now combine the exploitation of both types of corpora, thus

providing more detailed information on their object of study. A typical

example of such analyses is the one performed by Ramón & Labrador

(2008), who provided an analysis of the translation of -ly adverbs of

degree from English to Spanish, which can be translated literally with

-mente adverbs but also non-literally with a prepositional phrase or an

adjective phrase for instance. By using first a comparable monolingual

corpus to compare the use of -mente adverbs in original and translated

texts and then a parallel corpus to investigate the translational options

retained by translators (literal translation with a -mente adverb,

prepositional phrase, adjective, etc.), Ramón & Labrador provide

significant results and reveal cross-linguistic differences that can be

exploited for translator training and translation quality assessment. For

instance, translators have a tendency to overuse -mente adverbs when

translating -ly adverbs but in varying proportions depending on the type of

text (fiction vs. non-fiction), on the negative quality of -ly degree adverbs

(e.g. awfully, badly, poorly), or on the type of degree expressed by the

adverb (Ramón & Labrador noticed an overuse of literal translation for

absolute or very high degree adverbs like absolutely or extremely). They

also noticed that omission often occurred (8% of cases in their corpus),

and the finer-grained observation of their parallel corpus allowed them to

establish when such a translation strategy is likely to occur.

Nevertheless, such an approach is not the rule in the CBTS field. For

instance, general studies on differences between translated texts and

original texts such as Xiao (2010) on Chinese and Laviosa (1997; 1998)

on English, which both focus on the lexical features of translated texts

(lexical density; choice of high-frequency words), while providing

instructive, crucial information on general intra-language differences, do

not provide any specific information about the systematicity or non-

systematicity of the general patterns observed. This stems from the fact

that these studies rely on comparable monolingual corpora, namely TEC

vs. BNC for Laviosa (1997, 1998), ZJU Corpus of Translational Chinese

vs. Lancaster Corpus of Mandarin Chinese for Xiao (2010), and their aim

is to investigate such general intra-language differences, which are

interpreted in terms of translation universals. Their results, though highly

valuable, cannot be fully exploited for translator training and translation

quality assessment.

In the same way, the kind of qualitative approach adopted in Olohan’s

(2003) study on contractions in translated and original English is not

sufficient, despite the fact that it goes beyond an overall intra-language

comparison by investigating the influence of linguistic context on the

characteristics of translated texts. The study does provide some kind of

qualitative analysis by showing that the initial quantitative analysis hid

important variations in the use of contractions and that the co-text and the

discourse functions of contracted forms are important in that they both

have an influence on the (non-)use of contractions in texts. However, it

does not provide the full picture as it relies on a comparable corpus only.

This is of course irrelevant for Olohan’s study itself on contracted forms,

but what such an approach cannot tell us, in spite of its irrefutable

qualitative aspect, is whether there are cases that go against the general

trend. This could be the case if a full form in a source language where

contractions are used were translated with a contracted form.

Contractions are naturally an irrelevant tertium comparationis for such an

analysis, but the present study on existential constructions, which exist in

the two languages under consideration, has clearly shown that the general

results hid a significant number of counter-examples to the general trends.

The aim of this article was therefore to reaffirm that studies combining

different types of corpora are more informative than general numerical

approaches, which fail to provide the whole picture needed to define

precisely the characteristics of the third code. I

have shown that a finer-grained approach thanks to a parallel corpus, even

a small one, helps to (i) uncover the (non-)systematicity of the observed

general patterns and (ii) provide information about translators’ strategies.

5. Concluding remarks and future research

The article complements Cappelle & Loock’s (2013) study by providing a

finer-grained approach to the translation of existential constructions from

English to French and from French to English. I have shown that overall

results can hide more subtle translation strategies than a general

approach based on comparable corpora can suggest. This study has also

provided information on the ways existential constructions can be

translated when a literal translation is not to be retained.

The next step now is to try and determine when each strategy is used

(correlation with linguistic context and/or pragmatic function): when

translators decide not to translate an existential construction literally, is

this due to a specific syntactic reorganization (e.g. a syntactically heavy

NP following there+BE or il y+AVOIR in the source-text, or a lexical gap

in the target language) or is this due to some more general usage

constraint? What are the linguistic factors that guide translators’ choices

between the different strategies at their disposal? Is there a link with the

different subtypes of existential constructions (see e.g. Huddleston &

Pullum’s (2002) distinction between ‘bare’ and ‘extended’ existential

constructions, depending on the presence/absence of an adverbial

extension or a modifier after the notional subject in the sentence)? It

would also be interesting to compare the results with results for other

genres, such as press articles: do we find similar inter-/intra-language

differences? Do translators use the same strategies when translating

existential constructions? The first question would be particularly relevant

if one wants to determine the role of translation universals, which by

definition are not supposed to be genre-dependent.

The finer-grained results discussed are important if we consider that such

case studies can provide crucial information for both translator training

and translation quality assessment. Although there is no consensus on this

idea among researchers and translation trainers, in particular for literary

translation, results obtained within the field of corpus-based translation

studies can have powerful pedagogical value. While there is general

consensus on the fact that translated language differs from original

language in a number of ways, researchers still disagree on the

interpretation of these results: are they in some way a natural, translation-

inherent, phenomenon, revealing translation universals which, as such,

cannot be avoided, or should we say that translated texts that differ from

original texts in a specific language are translations that can be improved?

In other words, should we consider translated language as variation

comparable to dialectal variation or should we consider that the over-

representation or under-representation of a given linguistic construction,

say existential constructions, means that the quality of the translation

should be improved? From an even more general perspective, should we

consider that translated language is intrinsically different and represents

what researchers have called a third code (but it’s okay, though, it’s still

the same language) or should we consider that “the utopian goal is to

make it virtually impossible to tell the translation from an original text in

that language” (Teubert 1996: 241)? If we consider the second option,

then the results of this case study on existential constructions for instance

can be used for translator training. While students should be aware of

inter-language differences between English and French and be aware of

translation procedures other than literal translations that can be used to

translate existential constructions, they should also be warned that

systematically adding existential constructions when translating from

French to English or omitting them when translating from English to

French is a mistake.

Evaluating the quality of translations, which is probably the most

debated question concerning translation (see Depraetere 2011 or Loock et

al. 2013 for recent experiments), could also be carried out objectively by

checking the number of existential constructions in the translations to

determine whether they are under- or over-represented in relation to

original language. All these questions related to translator

training and quality assessment are left open for future research within the

CorTEx research project (Corpus, Translation, Exploration) at the

University of Lille.
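
The frequency check described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure used in this study or in Cappelle & Loock (2013): the function names, the choice of the two-corpus log-likelihood statistic (G2) and the critical value of 3.84 (p < 0.05, one degree of freedom) are all assumptions introduced here, and any counts passed to the functions would have to come from actual corpus queries.

```python
import math


def per_thousand(count: int, corpus_size: int) -> float:
    """Normalised frequency: occurrences per 1,000 words."""
    return 1000 * count / corpus_size


def log_likelihood(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Two-corpus log-likelihood (G2) for a single construction."""
    total = count_a + count_b
    g2 = 0.0
    for observed, size in ((count_a, size_a), (count_b, size_b)):
        # Expected count if the construction were equally frequent in both corpora.
        expected = size * total / (size_a + size_b)
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def representation(count_translated: int, size_translated: int,
                   count_original: int, size_original: int,
                   critical_value: float = 3.84) -> str:
    """Label a construction in translated texts relative to original texts."""
    g2 = log_likelihood(count_translated, size_translated,
                        count_original, size_original)
    if g2 < critical_value:
        return "similar"
    rate_translated = per_thousand(count_translated, size_translated)
    rate_original = per_thousand(count_original, size_original)
    if rate_translated > rate_original:
        return "over-represented"
    return "under-represented"
```

Calling `representation(count_translated, size_translated, count_original, size_original)` with the number of existential constructions and the word count of each subcorpus returns one of three labels. Such a label only signals an overall difference; as argued throughout this article, it says nothing about the translation strategies behind it.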

Acknowledgements

I would like to thank first of all Bert Cappelle, with whom I conducted the

quantitative study without which this additional analysis could not have

been achieved. I also thank the participants in the CorTEx project (Gert

De Sutter, Michaël Mariaule, and Cindy Lefebvre-Scodeller in particular)

for their comments and suggestions on the study. Finally, I thank

the anonymous reviewers for comments and suggestions that greatly

contributed to improving the quality of this paper. Any remaining errors

are naturally mine.

References

Altenberg, B. 1999. “Adverbial connectors in English and Swedish:

Semantic and lexical correspondences”. In H. Hasselgård &

S. Oksefjell (eds), Out of Corpora. Studies in Honour of Stig

Johansson, 249-268. Amsterdam: Rodopi.

Altenberg, B. & Granger, S. 2002. “Recent trends in cross-linguistic

lexical studies”. In B. Altenberg & S. Granger (eds), Lexis in

Contrast. Corpus-based Approaches, 3-48. Amsterdam-Philadelphia:

John Benjamins.

Baker, M. 1993. “Corpus linguistics and translation studies: Implications

and applications”. In M. Baker et al. (eds), Text and Technology, 233-

250. Amsterdam-Philadelphia: John Benjamins.

Baker, M. 1995. “Corpora in Translation Studies: An overview and some

suggestions for future research”. Target 7(2): 223-243.

Baker, M. 1996. “Corpus-based translation studies: The challenges that lie

ahead”. In H. Somers (ed.), Terminology, LSP and Translation.

Studies in Language Engineering in Honour of Juan C. Sager, 175-

186. Amsterdam-Philadelphia: John Benjamins.

Ballard, M. 2003. Versus: repérages et paramètres, vol. 1. Paris: Ophrys.

Ballard, M. 2004. Versus: des signes au texte, vol. 2. Paris: Ophrys.

Becher, V. 2010. “Abandoning the notion of 'translation-inherent'

explicitation: Against a dogma of translation studies”. Across

Languages and Cultures 11(1): 1-28.

Bergen, B.K. & Plauché, M.C. 2005. “The convergent evolution of radial

constructions: French and English deictics and existentials”.

Cognitive Linguistics 16(1): 1-42.

Birner, B. J. & Ward, G. 1998. Information Status and Noncanonical

Word Order. Amsterdam-Philadelphia: John Benjamins.

Cappelle, B. & Loock, R. 2013. “Is there interference of usage

constraints? A frequency study of existential there is and its French

equivalent il y a in translated vs. non-translated texts”. Target 25(2):

252-275.

Chuquet, H. & Paillard, M. 1987. Approche linguistique des problèmes de

traduction anglais-français. Paris: Ophrys.

Corpas Pastor, G., Mitkov, R., Afzal, N., & Pekar, V. 2008. “Translation

universals: do they exist? A corpus-based NLP study of convergence

and simplification”. In Proceedings of the Eighth Conference of the

Association for Machine Translation in the Americas (AMTA-08).

Waikiki, Hawaii.

Davies, M. 2004. BYU-BNC: The British National Corpus. Available at

http://corpus.byu.edu/bnc/.

De Sutter, G., Delaere, I. & Plevoets, K. 2012. “Lexical lectometry in

corpus-based translation studies: Combining profile-based

correspondence analysis and logistic regression modelling”. In M.P.

Oakes & J. Meng (eds), Quantitative Methods in Corpus-based

Translation Studies. A Practical Guide to Descriptive Translation

Research, 325-346. Amsterdam-Philadelphia: John Benjamins.

De Sutter, G., Goethals, P., Leuschner, T. & Vandepitte, S. 2012.

“Towards methodologically more rigorous corpus-based translation

studies”. Across Languages and Cultures 13(2): 137-143.

Depraetere, I. (ed.). 2011. Perspectives on Translation Quality. Berlin-

Boston: Walter de Gruyter.

Ebeling, J. 1998. “Contrastive linguistics, translation, and parallel

corpora”. Meta 43(4): 602-615.

Ebeling, J. 2000. “Using translations to explore construction meaning in

English and Norwegian”. In S. Johansson & S. Oksefjell (eds),

Corpora and Cross-linguistic Research: Theory, Method and Case

Studies, 169-195. Amsterdam-Atlanta: Rodopi.

Frawley, W. 1984. “Prolegomenon to a theory of translation”. In

W. Frawley (ed.), Translation: Literary, Linguistic and Philosophical

Perspectives, 159-175. Newark: University of Delaware Press.

Reprinted in L. Venuti (ed.) 2000. The Translation Studies Reader,

250-263. London-New York: Routledge.

Granger, S. 2003. “The corpus approach: A common way forward for

contrastive linguistics and translation studies?” In S. Granger, J. Lerot

& S. Petch-Tyson (eds), Corpus-based Approaches to Contrastive

Linguistics and Translation Studies, 17-29. Amsterdam-New York:

Rodopi.

Granger, S. 2010. “Comparable and translation corpora in cross-linguistic

research: Design analysis and applications”. Journal of Shanghai

Jiaotong University 2: 14-21.

Grevisse, M. 1986. Le bon usage, 12th edition by André Goosse. Paris:

Duculot.

Guillemin-Flescher, J. 1986. Syntaxe comparée du français et de

l’anglais. Problèmes de traduction. Gap-Paris: Ophrys.

House, J. 2008. “Beyond intervention: Universals in translation”.

Trans-kom 1: 6-19.

Huddleston, R. & Pullum, G. 2002. The Cambridge Grammar of the

English Language. Cambridge: Cambridge University Press.

Izquierdo, M., Hofland, K. & Reigem, Ø. 2008. “The ACTRES parallel

corpus: An English-Spanish translation corpus”. Corpora 3(1): 31-41.

James, C. 1980. Contrastive Analysis. London: Longman.

Johansson, S. 2007. Seeing through Multilingual Corpora. On the Use of

Corpora in Contrastive Studies. Amsterdam-Philadelphia: John

Benjamins.

Lambrecht, K. 1994. Information Structure and Sentence Form.

Cambridge: Cambridge University Press.

Laviosa, S. 1998. “Core patterns of lexical use in a comparable corpus of

English narrative prose”. Meta 43(4): 557-570.

Laviosa-Braithwaite, S. 1997. “Investigating simplification in an English

comparable corpus of newspaper articles”. In K. Klaudy & J. Kohn

(eds), Transferre Necesse Est. Proceedings of the 2nd International

Conference on Current Trends in Studies of Translation and

Interpreting, 5-7 September 1996, Budapest, Hungary, 531-540.

Budapest: Scholastica.

Loock, R., Mariaule, M. & Oster, C. 2013. “Traductologie de corpus et

qualité: étude de cas”. Paper presented at the Tralogy 2 Conference,

Paris, France, 17-18 January 2013.

Malmkjaer, K. 1998. “Love thy neighbour: Will parallel corpora endear

linguists to translators?” Meta 43(4): 534-541.

Mason, I. 2001. “Translator behaviour and language usage: Some

constraints on contrastive studies”. Journal of Linguistics 26: 65-80.

Mauranen, A. & Kujamäki, P. (eds). 2004. Translation Universals. Do

they Exist? Amsterdam-Philadelphia: John Benjamins.

Nattinger, J. R. & DeCarrico, J. S. 1992. Lexical Phrases and Language

Teaching. Oxford: Oxford University Press.

Olohan, M. 2003. “How frequent are the contractions?” Target 15(1): 59-

89.

Ramón, N. & Labrador, B. 2008. “Translation of -ly adverbs of degree in

an English-Spanish parallel corpus”. Target 20(2): 275-296.

Schjoldager, A. 2008. Understanding Translation. Århus: Academica.

Selinker, L. 1972. “Interlanguage”. International Review of Applied

Linguistics 10: 209-241.

Teubert, W. 1996. “Comparable or parallel corpora?” International

Journal of Lexicography 9: 238-264.

Tymoczko, M. 1998. “Computerized corpora and the future of translation

studies”. Meta 43(4): 652-658.

Vinay, J.-P. & Darbelnet, J. 1958/1995. Comparative Stylistics of French

and English: A Methodology for Translation (Translated and edited

by J. C. Sager & M.-J. Hamel). Amsterdam-Philadelphia: John

Benjamins.

Xiao, R. 2010. “How different is translated Chinese from native Chinese?

A corpus-based study of translation universals”. International Journal

of Corpus Linguistics 15(1): 5-35.

1 The terms ‘addition’ and ‘omission’ are taken from Johansson (2007: 26).

2 http://www.frantext.fr

3 http://www.llc.manchester.ac.uk/ctis/research/english-corpus

4 Linguists and grammarians have used two labels in the literature to refer

to English there-sentences, ‘existential’ and ‘presentational’ constructions,

which has led to some confusion. We use here, as in Cappelle & Loock

(2013), the label ‘existential’ to refer to there sentences with the verb be

(in all its possible forms) and use ‘presentational’ for there-constructions

with other verbs.

5 Some electronic corpora of translated French do exist, like the PLECI

corpus (Poitiers-Louvain Echange de Corpus Informatisés) for instance

(see http://www.uclouvain.be/en-cecl-pleci.html), but have remained

modest in size and unavailable for other researchers. A bigger corpus of

translated literary texts is currently being built within the CorTEx project

(Corpus, Translation, Exploration) and is to contain ca. 5 million words.

6 ‘Mutual translatability’ refers to Altenberg’s (1999) definition of the

mutual correspondence (MC) statistical measure, which provides the

frequency with which a pair of items from two languages A and B (here

there-constructions and il y a-constructions) are translated into each other

in a bi-directional corpus. This is calculated as a percentage: ((frequency

in A translated texts + frequency in B translated texts) / (frequency in A

source texts + frequency in B source texts)) × 100.

7 The corpus compiled for Cappelle & Loock (2013) contains full novels

instead of extracts. This choice yields a smaller number of novels than in

studies whose corpora consist of extracts from a larger number of novels,

a design that reduces the risk of author/translator idiosyncrasies. It did,

however, allow a direct comparison between the number of existential

constructions in the English texts and the number in the French texts,

which is systematically lower in French than in English, regardless of

the translation direction.
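
The mutual correspondence (MC) measure defined in note 6 can be written out as a short function. This is a minimal sketch that simply transcribes the formula given in that note; the function and parameter names are hypothetical, and the four frequencies are assumed to have been counted beforehand in a bi-directional parallel corpus.

```python
def mutual_correspondence(a_translated: int, b_translated: int,
                          a_source: int, b_source: int) -> float:
    """Mutual correspondence (MC) as a percentage, following note 6.

    a_translated, b_translated: frequencies in A and B translated texts.
    a_source, b_source: frequencies in A and B source texts.
    """
    total_source = a_source + b_source
    if total_source == 0:
        # With no source-text occurrences the ratio is undefined.
        raise ValueError("MC is undefined when both source frequencies are 0")
    return 100 * (a_translated + b_translated) / total_source
```

An MC of 100 would mean that the two constructions are always translated into each other, and an MC of 0 that they never are. The single figure does not show whether the correspondence is stronger in one translation direction than in the other, so the two directions may also need to be inspected separately.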