Harry Potter Corpus Design


Participants:

•Rebecca Kirkman (French)
•Vita Velletri (Chinese)
•Weiting Wu (Chinese)

How to Approach the Corpus Design

•Part 1: Considering preliminary translation issues (French - Rebecca)

•Part 2: Considering preliminary translation issues (Chinese)

•Part 3: Explaining our corpus design and our approach to the project

Aim

To develop a corpus project that investigates how the position or status of translated literature in the TL system may have an impact on the translation strategies used by translators from different countries.

How?

By building a multilingual translational corpus of Harry Potter into Chinese, French, Arabic and (Mexican) Spanish.

Additional Information

We need to try to establish links between the translation strategies used and specific linguistic features of the translated texts, such as characters’ names.

Specific Linguistic Features of the Translated Texts

Project Aims

• Investigate manifestations of translation universals;

• Investigate these manifestations through Harry Potter texts to discover what forms they take;

• Investigate the differences in translators’ / translation styles in TTs;

• Compare translation styles to see if translation approaches have changed from the first to the seventh book.

Translation Universals

Translations have certain universal features that separate them from original texts, and these features are caused in and by the process of translation. Mona Baker has given this issue a lot of attention and states that the universal features arise naturally, since “the nature and pressures of the translation process must leave traces in the language that translators produce”.

Explicitation

The theory of explicitation concerns the tendency in translations to “spell things out rather than leave them implicit” (Baker 1996:180). Explicitation can be expressed syntactically or lexically. For example, translated texts tend to have a higher frequency of conjunctions than original texts. Lexical explicitation can be achieved through various means, but it is often done by adding nouns to explain a piece of information that a target-culture reader would otherwise not understand.

Another possible manifestation of explicitation is the fact that translations tend to be longer than their original texts. When translations become longer, the additions to the ST are often made to explain features in the ST that might not be known to readers in the TT culture. The translation thus becomes easier to understand than a more literal rendering would be. This manifestation has the advantage of being relatively easy to examine.

In this study, explicitation is thought to manifest itself in two ways. Firstly, it was evident at a very early stage that the TTs are longer than the STs. Secondly, if more information has been added to the target texts than has been removed from the source texts, this also indicates that they have been explicitated.
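The length comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `word_count` and `length_ratio` are hypothetical, the two sentences are invented for the example rather than taken from the books, and counting words with a regular expression only makes sense for space-delimited languages such as English, French and Spanish. Chinese would need a word segmenter or a character-based measure instead.

```python
import re


def word_count(text: str) -> int:
    """Count runs of word characters (a rough proxy for words)."""
    return len(re.findall(r"\w+", text))


def length_ratio(source: str, target: str) -> float:
    """Return TT length divided by ST length, measured in words."""
    src_words = word_count(source)
    if src_words == 0:
        raise ValueError("source text contains no words")
    return word_count(target) / src_words


# Invented example pair (not quoted from any edition).
st = "Harry looked at the hat."
tt = "Harry regarda le vieux chapeau pointu."

ratio = length_ratio(st, tt)
# A ratio above 1.0 means the target sample is longer than the source sample.
print(f"TT/ST length ratio: {ratio:.2f}")
```

A ratio above 1.0 for a sample is compatible with explicitation but does not prove it, since languages differ in how many words they need to express the same content.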

Simplification

Simplification is the tendency of translated texts to contain simplified language compared to the original text. For example, long sentences are often divided into several shorter ones. One indicator of simplification is a relatively low lexical density, meaning that the number of function words or grammatical words is high in proportion to the number of lexical words. Lexical words contain more information than grammatical words, and using fewer lexical words means that the reader will have to keep track of less information. Using a less varied vocabulary is also one manifestation of simplification.
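Lexical density and vocabulary variation can both be computed from a token list. The sketch below is a rough illustration for English only: the `FUNCTION_WORDS` set is a small hypothetical stand-in for a proper function-word list (or for POS tags), and the type-token ratio is used here as one simple measure of how varied the vocabulary is.

```python
import re

# Hypothetical, deliberately tiny list of English function words.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on", "at",
    "for", "with", "he", "she", "it", "they", "is", "are", "was", "were",
    "that", "this", "his", "her",
}


def tokens(text: str) -> list:
    """Lower-case the text and extract alphabetic tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_density(text: str) -> float:
    """Share of tokens that are lexical (non-function) words."""
    toks = tokens(text)
    if not toks:
        return 0.0
    lexical = [t for t in toks if t not in FUNCTION_WORDS]
    return len(lexical) / len(toks)


def type_token_ratio(text: str) -> float:
    """Distinct word forms divided by total tokens."""
    toks = tokens(text)
    return len(set(toks)) / len(toks) if toks else 0.0


sample = "The cat sat on the mat and the dog sat on the rug."
print(f"lexical density:  {lexical_density(sample):.2f}")
print(f"type-token ratio: {type_token_ratio(sample):.2f}")
```

Type-token ratio is sensitive to text length, so samples of equal size should be compared.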

Another possible sign of simplification is that punctuation tends to change in translation. According to Malmkjaer (1997), punctuation can be rated on a scale from weak to strong in the order comma, semicolon, full stop. In translations, punctuation usually becomes stronger, in that commas are often translated into semicolons or full stops, and semicolons are translated into full stops. If the punctuation is stronger, it is highly likely that there are more sentences in the TT than in the ST, which indicates that long and complex sentences have been divided into several shorter ones, thereby decreasing the complexity of the text.

In the HP-corpus, simplification is assumed to be manifested in long sentences being divided into several shorter ones, stronger punctuation and the removal of the regional dialects that some characters speak in.
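Whether punctuation has become stronger can be checked by counting commas, semicolons and full stops in a source sample and in its translation. The sketch below assumes Western punctuation and uses invented sentences; the function names are hypothetical. It counts every full stop, including those in abbreviations and ellipses, so real samples would need some cleaning first.

```python
from collections import Counter

# Marks ordered from weak to strong, following the scale described above.
MARKS = (",", ";", ".")


def punctuation_profile(text: str) -> dict:
    """Count each of the three marks in the text."""
    counts = Counter(ch for ch in text if ch in MARKS)
    return {mark: counts[mark] for mark in MARKS}


def punctuation_shift(source: str, target: str) -> dict:
    """Target count minus source count for each mark."""
    src = punctuation_profile(source)
    tgt = punctuation_profile(target)
    return {mark: tgt[mark] - src[mark] for mark in MARKS}


# Invented example: one long sentence rendered as three short ones.
st = "He ran, he fell; he wept."
tt = "He ran. He fell. He wept."

shift = punctuation_shift(st, tt)
print(shift)
```

Fewer commas and semicolons together with more full stops in the TT point in the direction of stronger punctuation and shorter sentences.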

Normalisation

Normalisation or conservatism is what Baker calls the “tendency to exaggerate features of the target language and to conform to its typical patterns” (1996:183). This can take the shape of the translator over-using clichés or typical grammatical structures of the TL, often making grammatical those elements of the text that are ungrammatical in the source.

Anomalies in French

• A challenge faced by translators in their work comes from proper names.

• For translators working from English, names must sometimes be created when no official translation exists, in order to capture the feeling of the original text.

• Numerous examples of this exist in the French translations of Harry Potter.

Harry Potter characters

English | French
Crookshanks | Pattenrond
Fang | Crockdur
Fawkes | Fumseck
Filch | Rusard
Madame Pomfrey | Madame Pomfresh
Mad-Eye Moody | Maugrey Fol Œil
Moaning Myrtle | Mimi Geignarde
Nearly Headless Nick | Nick Quasi-Sans-Tête
Neville Longbottom | Neville Londubat
Professor Snape | Professeur Rogue
Tom Marvolo Riddle | Tom Elvis Jedusor
Scabbers | Croûtard
Wormtail | Queudver

Changes from ST (E) to TT (F)

English | French
Hogwarts | Poudlard
Boggart | Épouvantard
Muggle | Moldu
Dementor | Détraqueur
Sorting Hat | Le Choixpeau
Ravenclaw | Serdaigle
Hufflepuff | Poufsouffle
Slytherin | Serpentard
Diagon Alley | Chemin de Traverse
Knuts | Noises
Sickles | Mornilles
Galleons | Gallions
Squib | Cracmol
Butterbeer | Bièraubeurre
OWLs | BUSE (Brevet Universel de Sorcellerie Élémentaire)
NEWTs | ASPIC (Accumulation de Sorcellerie Particulièrement Intensive et Contraignante)

Harry Potter Characters (Chinese)

English | Simplified Chinese | Traditional Chinese

1 Harry Potter 哈哈 · 哈哈 哈哈哈哈
2 Hermione Granger 哈哈 · 哈哈哈 哈哈 · 哈哈哈
3 Ron Weasley 哈哈 · 哈哈哈 哈哈 · 哈哈哈
4 Severus Snape 哈哈哈西 · 哈哈哈 哈哈哈哈 · 哈哈哈
5 Albus Dumbledore 哈哈哈 · 哈哈哈哈 哈哈哈哈哈哈哈
6 Neville Longbottom 哈哈 · 哈哈哈 哈哈 · 哈哈哈
7 Voldemort 哈哈哈 哈哈哈
8 Sirius Black 哈哈哈哈哈哈哈 哈哈哈哈哈哈哈
9 Draco Malfoy 哈哈哈 · 哈哈哈 哈哈 · 哈哈
10 Bellatrix Lestrange 哈哈 · 哈哈哈 哈哈 · 哈哈哈
11 Fred/George Weasley 哈哈哈 · 哈哈哈哈哈 · 哈哈哈 哈哈 · 哈哈哈哈哈 · 哈哈哈
12 Luna Lovegood 哈哈 · 哈哈哈哈 哈哈 · 哈哈哈
13 Rubeus Hagrid 哈哈 哈哈・ 哈哈 哈哈・
14 Remus Lupin 哈哈哈哈 哈哈哈 · 哈哈
15 Lucius Malfoy 哈哈哈 哈哈哈・ 哈哈哈 哈哈・
16 Dolores Umbridge 哈哈哈哈 · 哈哈哈哈 哈哈哈 · 哈哈哈哈
17 Minerva McGonagall 哈哈哈 · 哈哈 哈哈哈哈
18 “Mad-Eye” Moody / Alastor Moody 哈哈哈哈哈哈哈哈哈 · 哈哈 哈哈哈 · 哈哈哈哈哈哈哈()
19 Molly Weasley 哈哈 哈哈哈・ 哈哈 哈哈哈・
20 Horace Slughorn 哈哈哈 哈哈哈哈哈・ 哈哈哈 哈哈哈・
21 Gilderoy Lockhart 哈哈哈 · 哈哈哈 哈哈哈 · 哈哈
22 Ginny Weasley 哈哈・哈哈哈
23 Sybill Trelawney 哈西 哈哈哈・
24 Nymphadora Tonks 哈哈哈哈 · 哈哈哈 哈哈女 哈哈・

Comparisons between ST to TT1/TT2(China/Taiwan)

•What is lost or “gained” in translation? Problems when using 音译 (phonetic transcription):
– Chinese is written with monosyllabic logograms, and homophones abound.
– Writing and pronunciation are separate.
– The translator can manipulate the transcription to add additional meaning.

•Different strategies adopted by different translators?
– China tends to preserve the original pronunciation
– Taiwan tends to transcribe along the spelling

Columns: English (ST), China (TL1), Taiwan (TL2), Comparison

Minerva McGonagall
China (TL1): 哈哈哈 哈哈・ (Mǐlèwá Màigé)
Taiwan (TL2): 哈哈哈哈 (Mài Mǐnàiwá)
Comparison: 哈 Mài (Taiwanese version) is a real Chinese surname. Curiously, McGonagall's name is put in Chinese order (surname first) in this version.

Dolores Jane Umbridge
China (TL1): 哈哈哈哈 哈哈哈哈哈・・哈 (Duōluòléisī Jiǎn Wūmǔlǐqí)
Taiwan (TL2): 哈哈哈 · 哈 · 哈哈哈哈 (Táolèsī Zhēn Ēnbùlǐjū)
Comparison: The Mainland Chinese translator transliterates the surname, rather inaccurately, as 哈哈哈哈 Wūmǔlǐqí (the literal meaning is 'dark housemaid inside strange', but the characters function phonetically). The Taiwanese rendition is also phonetic, although some meaning can be made out: 哈 ēn means 'kindness, favour, grace'; 哈 bù means 'not'; 哈哈 lǐ-jū means 'address'. The general implication is that there is no kindness or grace to be found with Dolores Umbridge.

Remus Lupin
China (TL1): 哈哈哈哈 (Lúpíng jiàoshòu)
Taiwan (TL2): 哈哈哈 哈哈・ (Léimùsī Lùpíng)
Comparison: 'Lupin' is rendered phonetically in both versions. The 哈 lú and 哈 lù in 'Lupin' are both surnames in Chinese. The name 哈哈 Lùpíng in the Taiwanese version means 'road is flat'. The Mainland version uses only the name 'Professor Lupin' or 'Lupin', omitting 'Remus'.

Sybill Trelawney
China (TL1): 哈哈 哈哈哈哈西 ・ (Xībǐ'ěr Tèlǐláonī)
Taiwan (TL2): 哈哈哈哈哈西 (Xībì Cuīlǎonī)
Comparison: Both are phonetic renditions, the Mainland version paying more attention to each individual sound. In the Taiwanese version, 哈 cuī is a Chinese surname and 哈哈 lǎo-nī means 'old girl'. This conceals a pun: the surname 哈 cuī has the same pronunciation as 哈 cuī meaning 'hasten, urge'. The name 哈哈哈 Cuīlǎonī thus means 'fast becoming an old spinster'. This term is often used by students in Taiwan to disparage teachers they don't like.

Rubeus Hagrid
China (TL1): 哈哈 哈哈・ (Lǔbó Hǎigé)
Taiwan (TL2): 哈哈 哈哈・ (Lǔbà Hǎigé)
Comparison: The 哈 hǎi in 'Hagrid' means 'sea', perhaps an allusion to his vast size? The Mainland name has probably taken the Taiwanese version as its inspiration, as in the case of many other names in Book 1.

Ron Weasley
China (TL1): 哈哈 哈哈哈・ (Luó'ēn Wéisīlái)
Taiwan (TL2): 哈哈 哈哈哈・ (Róng'ēn Wèisīlǐ)
Comparison: For 'Ron', the Taiwanese translator chooses a name that is appealing in Chinese, even if the sound is somewhat different from English. 哈 Róng means 'glory' (it is also a Chinese surname) and ēn means 'endowment, benefit'. The Mainland translator uses the standard Mainland transliteration of 'Ron' using 哈 Luó, also a Chinese surname, instead of Róng. Note: the first character in Weasley, 哈 Wéi or 哈 Wèi, is also a Chinese surname.

Draco Malfoy
China (TL1): 哈哈哈 哈哈哈・ (Délākē Mǎ'ěrfú)
Taiwan (TL2): 哈哈 哈哈・ (Zhuǎi-gē Mǎ-fèn)
Comparison: Malfoy's name is vital in setting the tone. The Mainland transliteration is fairly conventional, using favourable characters such as 哈 dé, meaning 'virtue', and 哈 fú, meaning 'happiness/fortune'. 哈 Mǎ means 'horse'. The Taiwanese version is more creative. 哈 zhuǎi refers to the ways of a person who feels himself a cut above everyone else and looks down on ordinary people. 哈哈 Zhuǎi-gē thus means 'arrogant big brother'. 哈哈 Mǎ-fèn means 'horse part'. (Using different characters, it could also mean 'horse manure', although in the movie, interestingly, the pronunciation is changed to Má-fěn.)

Hermione Granger
China (TL1): 哈哈 哈哈哈・ (Hèmǐn Gélánjié)
Taiwan (TL2): 哈哈 哈哈哈・ (Miàolì Gélánjié)
Comparison: In the Taiwanese version, 哈哈 Miàolì means 'good' and 'beautiful'. The pronunciation is possibly based on the second half of the name, i.e., 'mione'. The Mainland version comes up with a much better rendering: 哈 Hè is a Chinese surname and 哈 mǐn means 'quick/agile'. This could easily be a real Chinese name. Note: Gélánjié is the same in both editions, although this is not obvious because the Mainland characters are drastically simplified.

Fred Weasley
China (TL1): 哈哈哈 哈哈哈・ (Fúléidé Wéisīlái)
Taiwan (TL2): 哈哈 哈哈哈・ (Fúléi Wèisīlǐ)
Comparison: 哈哈哈 Fúléidé is standard for 'Fred', but the Taiwanese translator leaves off the dé. This way, both the twins, Fred and George, have two-character names.

Studying Translation: Possibilities

•We need to see:
– which corpora to choose;
– how alignment of corpora can be used in translation studies; and
– how some complex changes that translators make in translating texts are treated in alignment.

Parallel Corpora

• Parallel corpora consist of texts that in some way are parallel. The typical parallel corpus contains original texts written in one language or language variety, and one or more translations of these texts into one or more target languages or language varieties (Borin 2002). The relationship between the text(s) and its translation(s) is one of translation equivalence.

• With parallel corpora, translated text can be studied in a number of ways, but in this study the point is to discover translation effects. The basic idea behind this concept is that translated text can be linguistically and structurally different from original text, and the ways in which they differ can be discovered by comparing STs with their TTs through the use of parallel corpora.

Sentence Alignment

• Alignment of the corpus texts is a process performed on parallel corpora. Aligning a corpus is “the process of identifying and pairing up corresponding units in the two (or more) languages making up the parallel corpus”. This can be done on different levels, for example sentence alignment and word alignment.

• In sentence alignment, pairs of more or less equivalent source and target sentences are placed next to each other, for example in simple tables. This is done to discover the most obvious changes to the text, such as elements of meaning being transferred to another sentence in the TT, long sentences being translated as several short ones, and extensive omissions and additions.

• For corpora like the HP-corpus, sentence alignment can be done quite easily using basic word processing software such as Microsoft Word. For larger collections of text, automatic tools are necessary.
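For the larger collections mentioned above, automatic aligners commonly rely on sentence length. The sketch below is a much simplified length-based aligner using dynamic programming, offered only as an illustration: the bead types, the `PENALTY` values and the cost function are assumptions chosen for the example, not a published algorithm or the method of this project, and the sentences are invented. It allows one source sentence to pair with one or two target sentences (and vice versa), which covers the case of a long sentence being translated as several short ones.

```python
from typing import List, Tuple

# Allowed pairings: (source sentences consumed, target sentences consumed).
BEADS = [(1, 1), (1, 2), (2, 1), (1, 0), (0, 1)]
# Hypothetical penalties that make 1:1 the preferred pairing.
PENALTY = {(1, 1): 0.0, (1, 2): 2.0, (2, 1): 2.0, (1, 0): 10.0, (0, 1): 10.0}


def cost(src_len: int, tgt_len: int, bead: Tuple[int, int]) -> float:
    """Relative character-length mismatch plus the bead penalty."""
    mean = (src_len + tgt_len) / 2 or 1
    return abs(src_len - tgt_len) / mean * 10 + PENALTY[bead]


def align(src: List[str], tgt: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Return the cheapest sequence of sentence pairings."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                total = best[i][j] + cost(s_len, t_len, (di, dj))
                if total < best[ni][nj]:
                    best[ni][nj] = total
                    back[ni][nj] = (i, j)
    # Walk back from the end to recover the chosen pairings.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    pairs.reverse()
    return pairs


# Invented example: one long source sentence split into two in the target.
src = ["Harry walked slowly to the door, and he opened it.",
       "Nobody was there."]
tgt = ["Harry marcha lentement vers la porte.",
       "Il l'ouvrit.",
       "Il n'y avait personne."]

for source_part, target_part in align(src, tgt):
    print(source_part, "<->", target_part)
```

In this example the first source sentence is paired with the first two target sentences, and the second source sentence with the third. Character length is a poor guide when English is aligned with Chinese, so the cost function would need rescaling for that language pair.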

Part-of-Speech Tagging

• POS tagging is done because keeping track of the structural information of words and other text components is relevant. In translation, words and segments of a source text will sometimes change word class or have another function in the target text. The voice can also change, from passive to active or vice versa. These small linguistic changes can be indicators of more wide-ranging changes made to the text, which makes them worth further investigation.

• Modern language processing tools such as Machinese Syntax by Connexor use functional dependency grammar to POS-tag corpora automatically.

Word Alignment

•To be able to discover when the corresponding word is of a different word class in the TT than in the ST, the texts must be aligned on the word level. The ST word (or words) must be linked with the corresponding TT word (or words), and for this task, specialised software tools are required.
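Once a sample has been POS-tagged and word-aligned, finding word-class changes is a matter of comparing the tags on each side of a link. The sketch below assumes the tags and links already exist: the `Token` and `Link` types are hypothetical, the tag names are generic labels chosen for the example, and the data is hand-written rather than produced by any tagger or aligner.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    form: str  # the word as it appears in the text
    pos: str   # its part-of-speech tag


class Link(NamedTuple):
    source: List[Token]  # one or more ST words
    target: List[Token]  # the TT word(s) they correspond to


def class_shifts(links: List[Link]) -> List[Link]:
    """Return the links whose ST and TT tag sequences differ."""
    return [
        link for link in links
        if [t.pos for t in link.source] != [t.pos for t in link.target]
    ]


# Hand-written example: an English adverb rendered as a French
# preposition plus noun.
links = [
    Link([Token("ran", "VERB")], [Token("courut", "VERB")]),
    Link([Token("quickly", "ADV")],
         [Token("avec", "ADP"), Token("rapidité", "NOUN")]),
]

for link in class_shifts(links):
    st_words = " ".join(t.form for t in link.source)
    tt_words = " ".join(t.form for t in link.target)
    print(f"{st_words} -> {tt_words}")
```

Comparing whole tag sequences also flags links where one word is rendered as several, which may be of interest when looking for explicitation.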

Name of Corpus

Languages

Size and Date

Main Research Question

Source of samples of texts (translation or original text)

Corpus construction procedure

What type of corpora would you like to construct (refer to corpus typology)?

How would you use the corpus?

Would you use large-scale reference corpora like BNC/COCA/ZJU Corpus of Translational Chinese? Why?

Methodology

The sequence of work for this corpus project would be best performed as follows:

1. The texts to be included in the corpus are chosen and read, and a decision is made on a suitable size for the samples.

2. The sample texts are transferred to electronic form by scanning.

3. The samples are then aligned manually at sentence level rather than word level.

4. Software such as Connexor is used to supply POS tags to all tokens in the samples.

5. The POS-tagged samples are then word-aligned using software tools.

6. The word-aligned samples are studied using software programs and the results are analysed.

7. A small-scale case study is made of elements such as the dialects of Hagrid and Stan Shunpike.

8. An investigation is made of sentences to search for and analyse translation universals in close detail.

Name of Corpus: Translational Corpus of Harry Potter

Languages: English, French, Chinese, Mexican Spanish

Size and Date: HP 1-7

Main Research Question: How are the original English novels

Source of samples of texts (translation or original text): Harry Potter texts in SL and TLs

Corpus construction procedure: See methodology

What type of corpora would you like to construct (refer to corpus typology)? Parallel corpora

How would you use the corpus? To study translation equivalence and effects by creating sentence-aligned corpora.

Would you use large-scale reference corpora like BNC/COCA/ZJU Corpus of Translational Chinese? Possibly for Chinese, but not for the other languages.

Why? We need tools that encompass European languages.