Identifying useful phonetic components of kanji

Japanese Language and Literature Journal of the American Association of Teachers of Japanese

October 2013 Volume 47, Number 2

CONTENTS Articles

The Word Monosugoshi and Changing Perceptions of Nature in Medieval Japan ................................... Paul S. Atkins 159

New Chinese Immigrants in Japan: Cultural Translation and Linguistic Hybridity in Yang Yi’s and Mao Danqing’s Japanese-Language Writing ..... Lianying Shan 193

Identifying Useful Phonetic Components of kanji for Learners of Japanese ....................... Etsuko Toyoda, Arief Muhammad Firdaus, and Chieko Kano 235

Reviews

G. G. Rowley—An Imperial Concubine’s Tale: Scandal, Shipwreck, and Salvation in Seventeenth Century Japan ................................................Satoko Shimazaki 273

Gustav Heldt—The Pursuit of Harmony: Poetry and Power in Early Heian Japan .......................... Thomas Lamarre 278

Peter Flueckiger—Imagining Harmony: Poetry, Empathy, and Community in Mid-Tokugawa Confucianism and Nativism............................... Wilburn Hansen 286

Kuwamura Takeshi—Nippon Wars and Other Plays.........................................................................M. Cody Poulton 291

Kimi Kondo-Brown, Yoshiko Saito-Abbott, Shingo Satsutani, Michio Tsutsui, and Ann Wehmeyer (Eds.)—New Perspectives on Japanese Learning, Linguistics, and Culture ..............................................Mari Noda 296

Contributors ................................................................................. 303

Japanese Language and Literature 47 (2013) 235–272 © 2013 Etsuko Toyoda, Arief Muhammad Firdaus, and Chieko Kano

Identifying Useful Phonetic Components of kanji for Learners of Japanese

Etsuko Toyoda, Arief Muhammad Firdaus, and Chieko Kano

Abstract

The role of kanji phonetic components in the reading of Japanese words written with multiple kanji characters (kanji words) has been discussed for decades. Despite relevant research suggesting the potential usefulness of the phonological information provided by phonetic components in reading kanji words, they are underutilized in the learning/teaching of Japanese as a second/foreign language because of their complex nature: phonetic components exhibit varied degrees of reliability and applicability. This project evaluates the usefulness (reliability and applicability) of phonetic components in reading kanji words likely to be encountered by learners of Japanese as a second/foreign language. In this paper, we argue for the need to identify phonetic components that are useful for learners of Japanese, especially for those learners with an alphabetic background, and we demonstrate which phonetic components warrant spending time to learn/teach them by analyzing the reliability and applicability of each phonetic component that learners are likely to encounter. To this end, we create a comprehensive database comprising 174 potentially useful phonetic components. From the data, we identify 119 phonetic components that are useful for learners of Japanese in reading kanji words.

Introduction

The role of phonetic components (i.e., sub-character units that carry phonological information) in Chinese characters has been discussed on and off for decades. Of late, the didactic use of phonetic components in Chinese characters in second/foreign language instruction has again risen in popularity, in light of an increased demand for efficient Asian language instruction in the “Asian Century.” It is widely known among teachers and researchers of Japanese as a second/foreign language that learning kanji is one of the most difficult aspects in Japanese language
learning (e.g., Mori 1999, Yamashita and Maru 2000). Toyoda (2007), in one of her chapters, reviews the pedagogical literature on methodologies for kanji instruction, and concludes that providing learners with knowledge of functional components, such as radicals and phonetic components, is essential for the efficient learning of the Japanese script.

This project is an attempt to evaluate the utility of phonetic components in reading words written with multiple kanji (hereafter, kanji words) that learners are likely to encounter, specifically, in deriving or retrieving the pronunciations of unfamiliar or poorly memorized kanji occurring in words. The term “utility” in this paper is used to refer to both reliability and applicability. The terms, “usefulness” and “useful” are also used in the same semantic category, covering both reliability and applicability. “Reliability,” in this paper, indicates the level of confidence the reader can place in a particular pronunciation being correct when reading words. Reliability is sometimes discussed elsewhere in terms of two separate aspects: validity (whether a phonetic component indicates the pronunciation of the kanji) and consistency (whether all kanji with the same component have the same pronunciations) (see Saito, Masuda, and Kawakami 1998, 1999 for details). However, in this paper, these two aspects are not treated separately. “Applicability” is a new term used in this paper, which indicates how frequently and how commonly the knowledge of the phonetic component can be applied in reading words.

The utility (reliability and applicability) of phonetic components for learners may vary according to the stage of Japanese language learning and the area of interest of individual students. In this paper, however, utility is assessed not as something that applies at a particular stage of Japanese language learning or in a particular domain of discourse, but rather, during the entire period of learning, and in various domains. For practical purposes, therefore, in this study, the term “learners” is defined as general learners who encounter words in various texts in various domains during the entire period of their study of Japanese, from beginning to advanced levels. The utility of phonetic components is discussed with a focus on such learners.

The findings of the present study will provide valuable information to Japanese language learners, particularly those whose language is orthographically distanced from Japanese, and to Japanese language instructors who wish to systematically incorporate information regarding phonetic components into their vocabulary instruction.

Phonetic components give clues to the pronunciation of the kanji in which they occur (Xing 2006). More precisely, a given phonetic component will indicate one or more pronunciations, one of which may suggest the pronunciation of the kanji (Toyoda 2007). For non-Japanese speakers, particularly for learners with no exposure to kanji prior to learning Japanese, the use of phonetic components can be helpful in deriving or retrieving the pronunciations of unfamiliar or poorly memorized kanji, and it may lead to more efficient learning than rote memorization (e.g., Noguchi 1995). However, too much emphasis on the usefulness of phonetic components is apt to make learners confused and frustrated. One learner wrote the following after receiving instruction in regard to phonetic components:

While the kanji do share phonetic relationships, most of the time there are nothing but exceptions, and the differences are far too many for it to be of any use. Recognizing the phonetic component of a character isn’t difficult. There may be some details of this “technique” that I don’t know, but most of the time I can recognize the phonetic component just fine. The problem is that there’s no reasonably reliable way of guessing the pronunciation, making the phonetics essentially worthless to someone who hasn’t specifically learnt each character. In my experience, they are useful for remembering a reading one has already learned, and not a lot more.1

Let us explain the problem pointed out by this learner by using the following example. The kanji 予, 預, 野, and 序 all share a common phonetic component 予. While 予 and 預 are pronounced yo in kanji words, the kanji 野 and 序 are pronounced ya and jo, respectively. While the knowledge of this phonetic component and its pronunciation (i.e., yo) may be useful when trying to recall already learned kanji (e.g., 預), such knowledge is of negligible use for predicting the pronunciation of unlearned kanji in which the component occurs (e.g., 序).

As evidenced above, some phonetic components may not be worth learning. Thus, it is essential to identify useful phonetic components, as not all phonetic components are equally helpful due to varied degrees of utility, as measured by reliability and applicability.

Kano (1993) and Townsend (2011) analyzed the pronunciations among kanji sharing a common phonetic component, and ranked phonetic components according to their utility, as measured by the number of kanji sharing a common phonetic component with a common or similar pronunciation(s). One weakness of their lists is that the
phonetic components are analyzed at the kanji character level, not at the kanji word level, focusing only on the phonetic relationships between the kanji characters sharing a common phonetic component. In such a ranking, a phonetic component with a single pronunciation that occurs in multiple kanji characters would certainly seem promising. However, it may not be worth learning if the kanji characters in which that particular phonetic component occurs are used only in kanji words rarely occurring in texts. When learning kanji in a non-classroom situation, learners of Japanese typically encounter kanji embedded in words, not in isolation. Thus, the applicability of phonetic components for reading words needs to be taken into consideration.

In this study, we examine the utility of phonetic components, not only at the kanji character level, but also at the kanji word level, and identify useful phonetic components for learners of Japanese as a foreign language. To our knowledge, this is the first attempt to investigate the utility of phonetic components in deriving or retrieving the pronunciations of unfamiliar or poorly memorized kanji occurring in words that learners of Japanese encounter.

Previous research

A phonetic component is a sub-character unit carrying phonological information that may be useful in deriving or retrieving a particular type of pronunciation called on-yomi (see below) for a specific type of Japanese kanji character called a “phonetic compound” (Tamaoka 1991). Until recently, there were 1,945 officially recognized “everyday-use kanji” (in 2010, the list was revised to include an additional 196 characters and to remove 5 characters, resulting in a total of 2,136 characters; see Bunkachō 2010). According to Kaiho and Nomura (1983), 726 of the original 1,945 everyday-use characters (37.3%) have a single pronunciation, and the rest have two or more pronunciations. These pronunciations could be one of two types, on-yomi (pronunciation of Chinese origin) and kun-yomi (pronunciation of Japanese origin). Of the 1,945 kanji, 694 have only one on-yomi, and 32 have only one kun-yomi and no other pronunciation (Kaiho and Nomura 1983). About 60% have both on- and kun-yomi, 38% have multiple on-yomi and no kun-yomi, and 2% have multiple kun-yomi and no on-yomi (Tamaoka 2003). These statistics show that almost all kanji among the 1,945 have at least one on-yomi (Nomura 1981).
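The counts quoted above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the figures are those cited in the text, while the variable names are our own and purely illustrative.

```python
# Counts quoted in the text (Kaiho and Nomura 1983; Bunkachō 2010).
ORIGINAL_EVERYDAY_USE = 1945   # everyday-use kanji before the 2010 revision
ADDED_2010 = 196               # characters added in 2010
REMOVED_2010 = 5               # characters removed in 2010
ONLY_ONE_ON_YOMI = 694         # kanji with a single on-yomi and nothing else
ONLY_ONE_KUN_YOMI = 32         # kanji with a single kun-yomi and nothing else

# Size of the revised list.
revised_total = ORIGINAL_EVERYDAY_USE + ADDED_2010 - REMOVED_2010

# Kanji with exactly one pronunciation, and their share of the original list.
single_pronunciation = ONLY_ONE_ON_YOMI + ONLY_ONE_KUN_YOMI
single_share = round(100 * single_pronunciation / ORIGINAL_EVERYDAY_USE, 1)

print(revised_total)         # 2136
print(single_pronunciation)  # 726
print(single_share)          # 37.3
```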

Among those kanji with on-yomi, 66% are so-called “phonetic compounds,” comprising a semantic component and a phonetic component (Tamaoka 1991) and possibly other components. The phonetic component is sometimes called a “phonetic” or a “phonetic element” (e.g., Chen, Wu and Anderson 2003, Shu and Anderson 1998). Some researchers call it “phonetic radical” or “phonemic radical,” in contrast to “semantic radicals” (e.g., Flores d’Arcais, Saito and Kawakami 1995, Taft and Zhu 1995, Wang 1981). In this paper, we use “phonetic component,” avoiding the term “radical” or “element.” Phonetic components in the current kanji system differ, in most cases, from what are widely known as “radicals” (e.g., in the kanji 級, the left hand side 糸 is the radical of this kanji, and the right hand side 及 is a phonetic component). A phonetic component can comprise more than one element (e.g., the phonetic component of the kanji 指 is 旨, which is composed of 匕 and 日).

It is claimed that approximately 700 phonetic components exist in the entire set of Japanese kanji (Koda 2002), with approximately 400 in the original 1,945 everyday-use kanji (Chigusa 2008). However, for various reasons, such as variations in pronunciation at different times of introduction into Japan, and differences between the Chinese and Japanese phonetic systems (Tamaoka 1991), occasions where a phonetic component represents the exact pronunciation of the whole kanji are rare. If we consider the entire range of phonetic components, from totally unreliable to very reliable, the average phonological consistency is estimated to be about 40% (Koda 1999).

Despite such limited reliability, research indicates that a knowledge of phonetic components serves as a fundamental element of overall kanji knowledge (Tamaoka and Yamada 2000, Leong and Tamaoka 1995). Cognitive linguistics studies suggest that the phonological information offered by phonetic components becomes available to skilled readers at a very early stage of character recognition both in Japanese and Chinese (Koda 2002, Geva and Wang 2001), and that readers cannot inhibit the activation of the phonological information of phonetic components even when such information is unnecessary (Saito et al. 1998). Moreover, the automatic activation of phonological information extends beyond the level of single characters. During the recognition process of characters, not only does the phonology of the phonetic component of the target character become available, but also that of all the other characters sharing the same phonetic component (Masuda and Saito 2002, Taft and
Zhu 1997). It has been reported that native speakers of Japanese are able to selectively use such information in reading (Toyoda 2009).

Learners of the Japanese language do not possess a knowledge network of kanji sharing a common phonetic component when they start to learn to read in Japanese. Research (e.g., Koda 1995, Chikamatsu 1996) suggests that, in facing a new type of script, learners with an alphabetic background tend to rely on the phonological processing skills acquired through learning to read in their first language, while searching for new strategies to deal with the new script. Mori (1998) conducted an interesting experiment investigating the transfer of word processing skills from the first language to a second/foreign language, using artificial kanji with and without pronunciation markers (katakana) embedded within them. An alphabetic group (Americans) and two non-alphabetic groups (Chinese and Koreans experienced with kanji) were asked to identify the target artificial kanji after studying a list of those with or without the pronunciation markers (katakana). The alphabetic group found that phonologically equipped artificial kanji (i.e., those containing the pronunciation markers) were easier to remember than the phonologically devoid ones (those without the pronunciation markers). Conversely, the non-alphabetic group showed virtually no difference in recognizing the two different types of non-kanji.

The above study suggests that reliance on phonological information is stronger in alphabetic readers than in non-alphabetic readers. If kanji always have clear phonological information, as in the case of the pronunciation-marker-embedded artificial kanji in Mori’s study, learners of Japanese with an alphabetic background may be able to attain a high level of kanji word reading competence through their primary phonological processing. However, the reality is that kanji are, in most cases, not phonologically transparent. For learners of Japanese with an alphabetic background, the familiar method of phonological processing is not always readily applicable, and as a consequence, they may be left with an inability to effectively deal with phonologically devoid non-alphabetic script.

Phonetic components of kanji have the potential to assist learners of Japanese, particularly those with an alphabetic background, in reading kanji words. Using both timed tasks and interviews, Toyoda (2009) investigated whether learners of Japanese developed an awareness of the usefulness and limitations of the information supplied by phonetic components, and the processing skills to use them selectively when
necessary. The results showed an increasing awareness of, and ability to make use of, phonetic components, and that such awareness and ability increase in direct proportion to the overall level of language proficiency. However, the selective use that native speakers of Japanese reported in the interviews was not evident in the learners.

In response to these research findings, educators began to proclaim the importance of teaching phonetic components, and a number of dictionaries and textbooks focusing on phonetic components were published (e.g., Ishizawa 2012; Kano 1998; Shirakawa 2007; Miyashita 1991, 2000; Yamamoto 2007). Many of these publications are for native speakers of Japanese, either Japanese children learning kanji or adults who would like to improve their knowledge of kanji.

The Study of Kanji (Pye 1971) was the first book with an emphasis on phonetic components targeting non-Japanese learners of Japanese. In this book, the kanji that contain “a common element of form” (i.e., phonetic component) are arranged in groups. Remembering the Kanji 2 (Heisig 1987) takes a similar approach. Heisig analyzed the pronunciations of kanji using what he called “primitive signals,” which are similar to, but not exactly the same as, phonetic components. Following the publication of this book, the benefits of learning phonetic components for learners have been emphasized by many researchers (Kaiser 1996; Kano 1993; Toyoda 2001, 2009), and some even suggest that learning phonetic components makes it possible to learn kanji effortlessly (Noguchi 1995, Townsend 2011).

The above-mentioned publications, including the research articles, put emphasis on the positive side of the use of phonetic components. However, as discussed above, it is undeniable that some phonetic components are not worth learning. On the other hand, it is also true that there are some that are useful. It is therefore necessary to examine each individual phonetic component from various aspects in order to find which ones are useful and why. If learners of Japanese lack the ability to selectively use phonetic components, as suggested by Toyoda (2009), it is crucial to provide them with a list of useful phonetic components. To date, research examining the properties of individual phonetic components is scarce, and moreover, research investigating the role of those phonetic components at the word level is rare or non-existent. In order to assist learners reading kanji words, detailed analysis of the properties of individual phonetic components and the identification of those with high utility for reading kanji words is called for.

Creation of the database

In this section, we describe the creation of a database of phonetic components and the identification of those that are useful to learners of Japanese as a second/foreign language for reading kanji words. In the process of including phonetic components in the database, since our aim is to assist non-Japanese learners, those kanji that have the same physical features (shapes) as certain phonetic components, such as 長, as well as phonetic compound kanji are counted as kanji items. This is because a knowledge of phonetic components can be useful in either type of kanji. By the same token, we focus on the practical relationship between physical features (shapes) of the components and the pronunciations of the components in kanji used in words, more than on the historical relationships between orthographic components and pronunciations. As a case in point, in the kanji 島, the historical phonetic component is 鳥, but the modern character form (as opposed to the old form 嶋) obscures this relationship and makes it unreasonable to treat 鳥 as the phonetic component of 島.

In our database, all the phonetic components are displayed such that their shape is as near as possible to standard. However, in some cases, due to limitations of available fonts, shapes are slightly altered (e.g., 兌: the top two strokes should narrow at the bottom rather than at the top, as in the rightmost component of 説). Where a phonetic component cannot be reproduced in isolation on the computer, it is shown embedded in a kanji (e.g., 珍, where the rightmost component, i.e., not 王, is the phonetic one).

In order to identify useful phonetic components for learners, we first extracted phonetic components that learners are likely to encounter. Second, we collated factors likely to affect the degree of utility, and third, we quantified the phonetic components in relation to those factors.

The phonetic components included in our database are those that:
1. occur in three or more kanji within the current 2,136 everyday-use kanji officially recognized by the Japanese Agency for Cultural Affairs, and
2. occur in words within the Vocabulary Database for International Students learning Japanese created by Matsushita (2011).2

A digression is necessary to explain this database created by Matsushita, as it is relevant to our database. Matsushita’s Vocabulary Database for International Students Learning Japanese contains information on the 20,312 words that learners are likely to encounter in reading texts (both
formal and informal/colloquial) during the period from the elementary stage to a highly advanced stage of Japanese language learning (hereafter, everyday-use words for learners). The words were extracted from a large corpus called the Balanced Corpus of Contemporary Written Japanese 2009 Monitor Version (created by the National Institute for Japanese Language and Linguistics), which compiles words from books in various genres and internet forum sites (see Matsushita 2012 for details). For each word in the database, the following information is included: script type, pronunciation, degree of importance, and word level. Matsushita (2012) ordered the words according to their importance, indexed by an adjusted frequency called U (the product of frequency of occurrence F and dispersion D, U = F × D). The dispersion indicates how widely the word is distributed across different domains of discourse (Matsushita 2012: 54). Higher frequency words that are used across several domains of discourse (i.e., higher adjusted frequency words) have higher text coverage than lower frequency words that are used in limited domains (i.e., lower adjusted frequency words), and therefore the former are ranked higher (i.e., as more important) than the latter. In the Matsushita database, a smaller number shows a higher degree of importance. Matsushita (2011) grouped the everyday-use words for learners into five levels: basic, intermediate, advanced, high advanced, and super advanced, primarily based on the degree of word importance. Words of a higher degree of importance were placed into a more fundamental word level because words with a higher adjusted frequency are more beneficial for learners of Japanese than those with a lower adjusted frequency. However, some adjustment was necessary for the basic level. This is because even some words that have relatively low adjusted frequencies could be valuable words for learners (e.g., greeting words). For this reason, the word levels were adjusted according to the table of word difficulty levels set in the word list for the standardized Japanese Proficiency Test, which was initially developed in 1993 by the Japan Foundation and the Association of International Education (see Matsushita 2012 for details). Matsushita’s five word levels play a crucial role in our database of phonetic components.
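The ranking by adjusted frequency described above can be sketched as follows. This is only an illustration under stated assumptions: the relation U = F × D is taken from the text, but the class and function names, the placeholder words, the assumption that D lies between 0 and 1, and all numerical values are hypothetical and are not drawn from the Matsushita database.

```python
from dataclasses import dataclass


@dataclass
class WordStats:
    word: str
    frequency: float   # F: frequency of occurrence in the corpus
    dispersion: float  # D: spread across domains (assumed here to lie in [0, 1])

    @property
    def adjusted_frequency(self) -> float:
        # U = F * D, as stated in the text.
        return self.frequency * self.dispersion


def rank_by_importance(words):
    """Return {word: rank}; rank 1 is the most important (highest U)."""
    ordered = sorted(words, key=lambda w: w.adjusted_frequency, reverse=True)
    return {w.word: rank for rank, w in enumerate(ordered, start=1)}


# Hypothetical figures, invented purely for illustration.
sample = [
    WordStats("word_a", frequency=900, dispersion=0.2),  # frequent, few domains
    WordStats("word_b", frequency=400, dispersion=0.9),  # less frequent, widespread
    WordStats("word_c", frequency=100, dispersion=0.5),
]

ranks = rank_by_importance(sample)
print(ranks)
```

In this toy data, the widely dispersed word_b outranks the more frequent but narrowly used word_a, which mirrors the point that words used across several domains are treated as more important.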

Returning to the main topic, for our database, only the everyday-use words for learners with phonetic compound kanji (i.e., kanji containing a phonetic component) and/or with those kanji that have the same features as a phonetic component are extracted from the Matsushita database. In those cases where the kanji in the words are pronounced using kun-yomi,
the words are discarded because phonetic components are indications of on-yomi only. In addition to this, words that could be written in kanji but, in the Matsushita database, are marked as being conventionally written in hiragana (e.g., 多分 vs. たぶん), have been excluded from our database. As a result of this exclusion, the applicability of some phonetic components may be underestimated. However, the absence of information regarding word level and/or adjusted frequency in each form (kanji and hiragana) militated against their inclusion.

In order to determine which phonetic components to include, we consulted the Kanji Phonetic Component Dictionary [Kanji Onpu Jiten] (Yamamoto 2007), a research paper containing a list of kanji with phonetic components (Chigusa 2008), a phonetic component list for Japanese children (Den 1999), and a list of kanji with phonetic components for learners (Kano 1993).

Each of the above reference materials analyzes phonetic components at the individual kanji level. Being shared by many kanji is undoubtedly a significant aspect in judging the utility of phonetic components. For example, according to Chigusa (2008), the phonetic component 青 occurs in seven kanji, and therefore this phonetic component is potentially useful. On the other hand, the phonetic component 尉 occurs in only two kanji, in which case this phonetic component is obviously less useful than the former.

Based on the above-mentioned lists, we therefore initially selected 183 phonetic components, all of which are shared by three or more kanji and occur in Matsushita’s everyday-use words for learners. The phonetic components that, according to the above-mentioned reference materials, occur only in one or two kanji are not considered for inclusion in our database, due to the limited applicability of such phonetic components. In the process of finding words containing kanji with a phonetic component, we discovered that some kanji listed in Chigusa (2008) are not part of any on-yomi words listed as everyday-use words for learners in the Matsushita database. Those kanji are 詠, 猿, 禍, 挟, 窟, 銑, 霜, 袋, and 霧. In other words, learners of Japanese have almost no chance to encounter these kanji where they should be read in on-yomi. After removing the above kanji, the total number of kanji with any of the respective phonetic components, 永 , 袁 , 咼 , 夾 , 屈 , 先 , 相 , 代 is reduced to two (i.e., low applicability). As a result, these phonetic components are excluded from the final list of phonetic components, and thus our database contains a total of 174 phonetic components.
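The exclusion step described above can be sketched as a simple filter. This is a minimal sketch, not the procedure actually used to build the database: the function name, the data structures, and the small "attested" set are hypothetical. The seven kanji for 青 come from the text, while the three kanji listed for 永 are our own assumption for illustration (the text states only that two kanji remain once 詠 is removed).

```python
MIN_KANJI = 3  # threshold stated in the text: three or more kanji


def select_components(component_to_kanji, kanji_in_onyomi_words):
    """Keep components that still cover at least MIN_KANJI kanji once kanji
    that never occur in on-yomi words of the word list are dropped."""
    selected = {}
    for component, kanji_list in component_to_kanji.items():
        attested = [k for k in kanji_list if k in kanji_in_onyomi_words]
        if len(attested) >= MIN_KANJI:
            selected[component] = attested
    return selected


# Candidate components and the kanji in which they occur.
candidates = {
    "青": ["精", "清", "静", "請", "青", "晴", "情"],  # seven kanji, from the text
    "永": ["永", "泳", "詠"],                          # assumed membership
}

# Toy stand-in for "kanji read in on-yomi somewhere in the word list";
# 詠 is absent, as reported in the text.
attested_kanji = {"精", "清", "静", "請", "青", "晴", "情", "永", "泳"}

result = select_components(candidates, attested_kanji)
print(sorted(result))  # ['青']
```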

The number of kanji in which the phonetic component occurs is one factor affecting the phonetic component utility, as suggested by previous studies. However, a more crucial factor is the number of words in which the phonetic component occurs, and the word levels of those words, since learners normally encounter phonetic components within words, not in isolated kanji. In Matsushita’s 20,312-word database, 20,000 everyday-use words for learners are classified into five levels (basic, intermediate, advanced, high advanced, and super advanced) according to word importance, which incorporates both frequency of appearance and dispersion across domains. The remaining 312 words fall outside of this classification (see Matsushita 2011 for details). Among the words included in our own database, 95 fall into this unclassified category. These 95 words are therefore included in our database as a reference, but are not part of the calculation of applicability (the calculation of applicability will be discussed below). For our database, in accordance with the Matsushita database, the words in which the selected phonetic components occur are classified into five word levels: basic, intermediate, advanced, high advanced, and super advanced.

Even if a phonetic component occurs in a number of kanji, the phonetic component may not be useful if, for instance, the kanji only appear in a few specific technical words. When we investigate the utility of phonetic components, it is necessary to consider the degree of applicability of the knowledge of the phonetic component. As mentioned above, “applicability” indicates how frequently and how commonly the knowledge of the phonetic component can be applied in reading everyday-use vocabulary that learners of Japanese encounter.

We propose to add the following four factors:

Factor 1. Within the range of the everyday-use vocabulary for learners, how applicable is knowledge of the phonetic component in words within which the phonetic component occurs?

The number of words in which the phonetic component occurs is relevant in determining the utility of phonetic components. However, simply counting the number of words is not sufficient. In addition to the number of words, the words’ levels need to be taken into account. One phonetic component may occur in frequently appearing, widely used words (i.e., occurring in many words that are at the basic or intermediate level). Another may occur in infrequently used words or technical words used only in a certain domain (i.e., occurring in a few words that are at
the advanced or higher level).

According to the Matsushita database, the phonetic component 青 (as embedded in kanji appearing in on-yomi words) occurs in a total of 88 words of different word levels. For example, 情報 ‘information’ is an intermediate-level word that is commonly used in everyday life, whereas 血清 ‘blood serum’ is a super-advanced word that is used in a particular field and thus not often seen. For this reason, words need to be weighted according to their applicability, that is, according to the number of words in which the phonetic component occurs and the level of the words. The phonetic component 青 occurs in 88 words, and the total weighted score of the words is 200 (see Appendix 1, second table).

Although it is lengthy, we offer the following explanation of how the calculation was done, as it is an important aspect of our database. Using the component 青 as an example, our calculation involves the following three stages: First, a weighted score for each kanji containing the phonetic component 青 is calculated. Altogether there are 12 kanji to consider. The number of kanji in which this phonetic component occurs is, in fact, seven (精, 清, 静, 請, 青, 晴, 情) if we ignore how these kanji are pronounced within words. However, because we examine the applicability of phonetic components in words, not in characters, we treat the kanji with different pronunciations as separate items, thus resulting in 12 total kanji. These 12 kanji are 精 sei, 清 sei, 静 sei, 請 sei, 青 sei, 晴 sei, 静 jō, 情 jō, 清 shin, 請 shin, 情 zei, and 精 shō. Next, we count the number of words in which each of these 12 kanji with the phonetic component 青 occurs. In regard to the first item (i.e., the kanji 精 sei), for instance, there are 19 words. These 19 words consist of one intermediate (精神), five advanced (e.g., 精密), eight high-advanced (e.g., 精通), and five super-advanced words (e.g., 精鋭). Because learners are more likely to encounter words of lower levels than words of higher levels, we need to weight these 19 words according to their respective word levels. Therefore, the lower the level of the word, the more weight we assign (i.e., basic words x 5, intermediate x 4, advanced x 3, high advanced x 2, and super advanced x 1). In regard to 精 sei, for instance, we sum these weights (1 word x 4 + 5 x 3 + 8 x 2 + 5 x 1) and arrive at a score of 40 for this kanji with this pronunciation. The same calculation applies to the other 11 kanji.
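The first stage can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the authors' actual tooling; the names `LEVEL_WEIGHTS`, `weighted_score`, and `sei_counts` are hypothetical, while the weights and word counts are the ones given in the text for 精 read as sei.

```python
# Level weights as stated in the text: the lower the word level, the higher the weight.
LEVEL_WEIGHTS = {
    "basic": 5,
    "intermediate": 4,
    "advanced": 3,
    "high_advanced": 2,
    "super_advanced": 1,
}


def weighted_score(word_counts):
    """Sum (number of words at a level) x (weight of that level)."""
    return sum(LEVEL_WEIGHTS[level] * n for level, n in word_counts.items())


# Word counts for the kanji 精 read as sei, taken from the text.
sei_counts = {
    "intermediate": 1,
    "advanced": 5,
    "high_advanced": 8,
    "super_advanced": 5,
}

print(sum(sei_counts.values()))    # 19 words in total
print(weighted_score(sei_counts))  # 1*4 + 5*3 + 8*2 + 5*1 = 40
```

The same function would be applied to each of the other 11 kanji-pronunciation pairs.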

The second stage is to derive a score for each pronunciation. As shown above, the phonetic component 青 has five possible pronunciations, sei, jō, shin, zei, and shō (as in, for example, 晴天 sei-ten, 静脈 jō-myaku, 日清 nit-shin, 風情 fu-zei and 精進 shō-jin). Each of these pronunciations is shared by several kanji. For example, the pronunciation sei is shared by six kanji: 精, 清, 静, 請, 青, and 晴. The numbers of words and the scores for the respective kanji are: 19 words with a score of 40 (see the above calculation), 12 words with a score of 19, 11 words with a score of 22, 4 words with a score of 13, 4 words with a score of 12, and 2 words with a score of 4. The total number of words is 52 (19 + 12 + 11 + 4 + 4 + 2), with a total score of 110 (40 + 19 + 22 + 13 + 12 + 4). The same procedure is repeated for the remaining four pronunciations, that is, jō, shin, zei, and shō.

The third stage is to get the total number of words and eventually the total score for 青. The numbers of words for the five pronunciations are 52 (for sei), 31 (for jō), 3 (for shin), 1 (for zei), and 1 (for shō), totaling 88 (52 + 31 + 3 + 1 + 1). The scores of words sharing each of the respective five pronunciations are 110 (for sei), 77 (for jō), 8 (for shin), 3 (for zei), and 2 (for shō). Adding up these pronunciation scores gives the total score for this particular phonetic component, which is 200. This is the total word score for the phonetic component 青. In our database, the applicability of the knowledge of phonetic components in words is shown in weighted scores such as this. The higher the weighted score, the more likely the phonetic component is to be useful.
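The second and third stages are simple summations, which the following Python sketch reproduces for 青. The variable and function names are hypothetical. The per-kanji figures for sei and the per-pronunciation totals for jō, shin, zei, and shō are the ones quoted in the text (the per-kanji breakdown for the latter four is not given there, so only their totals are used).

```python
# Stage 2 input for the pronunciation sei: kanji -> (number of words, weighted score).
SEI_ITEMS = {
    "精": (19, 40),
    "清": (12, 19),
    "静": (11, 22),
    "請": (4, 13),
    "青": (4, 12),
    "晴": (2, 4),
}


def totals(pairs):
    """Add up (word count, weighted score) pairs."""
    pairs = list(pairs)
    return sum(w for w, _ in pairs), sum(s for _, s in pairs)


# Stage 2: total for one pronunciation.
sei_total = totals(SEI_ITEMS.values())

# Stage 3: total for the whole phonetic component.
per_pronunciation = {
    "sei": sei_total,
    "jō": (31, 77),
    "shin": (3, 8),
    "zei": (1, 3),
    "shō": (1, 2),
}
component_total = totals(per_pronunciation.values())

print(sei_total)        # (52, 110)
print(component_total)  # (88, 200)
```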

When we examine properties of phonetic components at the word level, it becomes clear that we also need to include the following factor in addition to Factor 1, mentioned above:

Factor 2. Within the range of the everyday-use vocabulary for learners, how many pronunciations does the phonetic component represent?

The number of pronunciations a phonetic component suggests in words is another essential factor. In some cases, a phonetic component represents only one pronunciation no matter where it occurs, which is a clear indication of the phonetic component being useful. For example, the phonetic component 長 (see Appendix 1, first table) represents only one pronunciation, chō. In other cases, one phonetic component can represent multiple pronunciations, depending on the word in which it occurs. In such cases, the phonetic component may confuse learners. For this reason, the number of pronunciations that a phonetic component represents needs to be examined. In previous studies (Kano 1993, Townsend 2011), pronunciations of a phonetic component were examined only at the kanji level. However, the number of pronunciations of a phonetic component at the word level is not always equal to the number of pronunciations shown in kanji dictionaries for the kanji that contain it. Therefore, it is critical that pronunciations of a phonetic component be examined at the word level.

Let us explain by again using the phonetic component 青 as an example (see Appendix 1). Here are the seven kanji which contain this phonetic component, together with their respective on-yomi, as specified in kanji dictionaries: 精 sei, shō; 清 sei, shō, shin; 静 sei, jō; 請 sei, shin, shō; 青 sei, shō; 晴 sei; 情 jō, sei. All of these kanji except 晴 have more than one on-yomi, and the total number of on-yomi variations shown in dictionaries is four: sei, shō, shin, and jō. However, at the word level, within Matsushita’s everyday-use vocabulary for learners, 情 has no word where it is pronounced sei, although it does have a word in which it is pronounced zei: 風情 fu-zei. Likewise, the kanji 清, 請, and 青 have no word within Matsushita’s word list where they are pronounced shō, despite this being a pronunciation listed in dictionaries.

Thus, in order to grasp precisely how many pronunciations one phonetic component represents in words, we need to look at how the kanji with the target phonetic component are pronounced in words, rather than how they are pronounced at the level of individual kanji. In our database, unlike previous studies that only looked at the pronunciations of the kanji listed in kanji dictionaries, we show the actual pronunciations of the kanji within words. Thus, for our database, the number of pronunciations of a phonetic component is not equal to the number of pronunciations of the kanji shown in kanji dictionaries. Rather, it is the number of pronunciations that the kanji containing the phonetic component actually take within words. A phonetic component may represent similar pronunciations, and in some cases, these variations can be explained by the phonological rules of the Japanese language. However, since we cannot expect learners to know these phonological rules, in our database we treat these variations as different pronunciations.

In the case of the phonetic component 青, there are five possible pronunciations: sei, jō, shin, zei, and shō. Compared to this phonetic component, the phonetic component 長, which has only one pronunciation, chō, is likely to be more useful. In other words, the lower the number of possible pronunciations, the more useful the phonetic component.
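The difference between counting pronunciations at the kanji level and at the word level can be illustrated with a small Python sketch for 青. The two tables below are transcribed from the readings quoted in this section (dictionary on-yomi, and the 12 kanji-pronunciation pairs attested in words); the names `DICTIONARY_READINGS`, `WORD_LEVEL_READINGS`, and `distinct_readings` are hypothetical and not part of the authors' database.

```python
# On-yomi listed in kanji dictionaries for the seven kanji containing 青.
DICTIONARY_READINGS = {
    "精": {"sei", "shō"},
    "清": {"sei", "shō", "shin"},
    "静": {"sei", "jō"},
    "請": {"sei", "shin", "shō"},
    "青": {"sei", "shō"},
    "晴": {"sei"},
    "情": {"jō", "sei"},
}

# Readings actually attested in words of the learner vocabulary.
WORD_LEVEL_READINGS = {
    "精": {"sei", "shō"},
    "清": {"sei", "shin"},
    "静": {"sei", "jō"},
    "請": {"sei", "shin"},
    "青": {"sei"},
    "晴": {"sei"},
    "情": {"jō", "zei"},
}


def distinct_readings(table):
    """Return the set of all readings that appear for any kanji in the table."""
    return set().union(*table.values())


dictionary_set = distinct_readings(DICTIONARY_READINGS)
word_set = distinct_readings(WORD_LEVEL_READINGS)

# Dictionary readings that never occur in the word list, per kanji.
unattested = {
    kanji: DICTIONARY_READINGS[kanji] - WORD_LEVEL_READINGS[kanji]
    for kanji in DICTIONARY_READINGS
    if DICTIONARY_READINGS[kanji] - WORD_LEVEL_READINGS[kanji]
}

print(len(dictionary_set), len(word_set))  # 4 5
print(word_set - dictionary_set)           # {'zei'}
```

Here `unattested` records shō for 清, 請, and 青 and sei for 情, matching the cases discussed above.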

In addition to the above factors, the reliability of each of the suggested pronunciations also affects the utility of phonetic components. Please recall that the term “reliability” in this paper is used to indicate the level of confidence the reader can place in a particular pronunciation being correct. This third factor is:

Factor 3. Within the range of the everyday-use vocabulary for learners, how reliable is the phonological information suggested by each phonetic component?

In some cases, a phonetic component suggests only one pronunciation regardless of the word in which it occurs. If the phonetic component represents only one pronunciation, it is pronounced with that pronunciation 100% of the time. That is, the pronunciation is 100% reliable. For example, the phonetic component 長 (see Appendix 1) occurs in three kanji: 長, 帳, and 張. These kanji are always pronounced chō. On the other hand, a phonetic component may suggest more than one pronunciation. Where there are multiple pronunciations, one may predominate. For example, let us say that a certain phonetic component has several pronunciations. One particular pronunciation of this phonetic component may occur in a number of words or in words of lower levels, whereas another pronunciation may occur in only a few words or in words of higher levels. In such a case, the former pronunciation, rather than the latter, is likely to be the predominant pronunciation, and hence more reliable in reading the target word.

Once again taking the phonetic component 青 as an example (see Appendix 1), there are five possible pronunciations: sei, jō, shin, zei, and shō. Among them, sei is the most reliable pronunciation for this phonetic component. The reliability of the phonological information is calculated as the “weighted score of the words for each pronunciation” divided by the “weighted score of the words for each phonetic component,” multiplied by 100. The higher this number, the more reliable the pronunciation. In the case of the phonetic component 青, as illustrated in the Factor 1 section, the weighted scores of the five pronunciations sei, jō, shin, zei, and shō are 110, 77, 8, 3, and 2, respectively. The total weighted score of this phonetic component is 200. Therefore, the reliabilities of the five pronunciations are 55% (110 / 200 * 100), 38.5% (77 / 200 * 100), 4% (8 / 200 * 100), 1.5% (3 / 200 * 100), and 1% (2 / 200 * 100), respectively. None of them is highly reliable, but the first pronunciation, sei, is more reliable than the others.
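The reliability calculation for 青 can be reproduced with a short Python sketch. The function name `reliabilities` and the variable names are hypothetical; the weighted scores are those given in the text.

```python
# Weighted word scores per pronunciation of 青, as given in the text.
PRONUNCIATION_SCORES = {"sei": 110, "jō": 77, "shin": 8, "zei": 3, "shō": 2}


def reliabilities(scores):
    """Reliability (%) = pronunciation score / component total score * 100."""
    total = sum(scores.values())
    return {pron: 100 * score / total for pron, score in scores.items()}


rel = reliabilities(PRONUNCIATION_SCORES)
most_reliable = max(rel, key=rel.get)

print(rel)            # {'sei': 55.0, 'jō': 38.5, 'shin': 4.0, 'zei': 1.5, 'shō': 1.0}
print(most_reliable)  # sei
```

Only the highest value (here 55%) is carried forward as the reliability of the phonetic component under Factor 3.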

One may think that the reliability decreases as the number of pronunciations increases. However, this is not always the case. As mentioned, although there may be several pronunciations, one may predominate. As a case in point, the phonetic component 果 has four pronunciations, but one of the pronunciations, ka, exhibits an 85.5% reliability. In regard to Factor 3, where there is more than one pronunciation, our focus is on the most reliable pronunciation of the phonetic component. This is because if there is a highly reliable pronunciation, the phonetic component is likely to be useful, as the learner is likely to be correct in reading with that pronunciation.

The last, but very important factor to be examined is:

Factor 4. Within the range of the everyday-use vocabulary for learners, how many kanji share a particular pronunciation of a certain phonetic component?

The number of kanji in question is the number of kanji for each of the pronunciations of a phonetic component, which may differ from the total number of kanji sharing a phonetic component. In cases where all the kanji sharing a phonetic component are pronounced with only a single pronunciation no matter where they occur, the number of kanji in question is obviously the same as the total number of kanji sharing the phonetic component. For example, as mentioned, the phonetic component 長 (see Appendix 1) has three kanji in which it occurs: 長, 帳, and 張. These kanji are always pronounced chō. That is, for this component 長, three kanji share the same pronunciation.

Where a phonetic component occurs in several kanji, and represents more than one pronunciation (i.e., the kanji sharing the phonetic component are pronounced differently in different words), what matters is how many kanji are assigned to each of the pronunciations. For example, within words the phonetic component 青 (see Appendix 1) occurs with five different pronunciations: sei, jō, shin, zei, and shō. The following six kanji, 精, 清, 静, 請, 青, and 晴, may all be pronounced sei in certain words. On the other hand, two kanji, 静 and 情, may be pronounced jō in certain words, and 清 and 請 may also be pronounced shin in certain words. One kanji, 情, may also be pronounced zei, and another kanji, 精, may also be pronounced shō. In other words, in this case, the numbers of kanji assigned to the respective pronunciations are six, two, two, one, and one.
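Counting the kanji assigned to each pronunciation amounts to inverting a kanji-to-readings table. The following Python sketch does this for 青, using the word-level readings quoted in this paper; the names `WORD_LEVEL_READINGS` and `kanji_per_reading` are hypothetical.

```python
from collections import defaultdict

# Word-level readings of the kanji containing 青, as listed in the text.
WORD_LEVEL_READINGS = {
    "精": {"sei", "shō"},
    "清": {"sei", "shin"},
    "静": {"sei", "jō"},
    "請": {"sei", "shin"},
    "青": {"sei"},
    "晴": {"sei"},
    "情": {"jō", "zei"},
}


def kanji_per_reading(table):
    """Invert kanji -> readings into reading -> number of kanji sharing it."""
    groups = defaultdict(set)
    for kanji, readings in table.items():
        for reading in readings:
            groups[reading].add(kanji)
    return {reading: len(kanji_set) for reading, kanji_set in groups.items()}


counts = kanji_per_reading(WORD_LEVEL_READINGS)
print(counts["sei"], counts["jō"], counts["shin"], counts["zei"], counts["shō"])  # 6 2 2 1 1
print(max(counts.values()))  # 6
```

The maximum (six kanji sharing sei) is the figure that matters for Factor 4.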

A phonetic component is likely to be reliable if it is shared by many kanji with an identical pronunciation occurring in the list of everyday-use words for learners. On the other hand, even if a phonetic component occurs in a number of kanji, it would not be useful if few or none of the kanji in which it occurs are pronounced in the same way. For example, the phonetic component 寸 occurs in eight different kanji. However, each of these eight kanji exhibits one or more of the eight possible pronunciations of 寸: tai, son, shu, tō, sun, su, zon, and ju. The numbers of kanji assigned for each of these pronunciations are 2, 2, 2, 1, 1, 1, 1, and 1.

As shown above, the total number of kanji sharing a single component does not, in itself, tell us much about the utility of that particular phonetic component. Therefore, in this study, where there is more than one pronunciation available for a certain component, our focus is on whether multiple kanji are assigned to any one of the pronunciations of the phonetic component. This is because if there are many kanji showing the same pronunciation, the phonetic component is likely to be useful.

Findings

Let us now examine each of the factors, both individually, and in relation to the other factors.

Factor 1. Within the range of the everyday-use vocabulary for learners, how applicable is knowledge of the phonetic component in words within which the phonetic component occurs?

The applicability of a phonetic component in words influences the utility of the phonetic component, particularly because learners normally encounter phonetic components in words, not in isolated kanji. The scores (determined by the number of words and the levels of those words in which a phonetic component occurs) range from eight to 404. The highest-scoring phonetic component is 各. This phonetic component occurs in 165 words, scoring 404. On the other hand, the phonetic component 夆 only occurs in five words within Matsushita’s list of everyday-use vocabulary for learners, and the total score for the phonetic component in these words is eight. Low-scoring phonetic components are not worth learning.

Word score | No. of PhC | Phonetic components (PhC)
400~ | 1 | 各
350~399 | 0 | -
300~349 | 6 | 生 重 方 己 寺 也
250~299 | 3 | 正 中 主
200~249 | 7 | 工 分 古 开 同 長 青
150~199 | 20 | 士 寸 義 見 艮 里 咅 羊 占 成 咸 周 意 者 兌 交 直 甬 軍 反
100~149 | 36 | 原 雚 元 予 令 家 乍 半 台 其 丁 毎 共 少 化 侖 僉 王 且 豆 孝 員 列 東 告 次 官 必 支 舌 白 皮 非 才 市 余
50~99 | 59 | 失 尚 圣 复 珍 辰 啇 干 付 祭 昜 充 尺 申 加 冓 害 求 五 丂 莫 責 更 比 谷 氐 良 亡 壮 区 敬 兄 炎 几 束 甫 九 真 果 亥 止 奇 亢 旨 及 扁 監 蒦 則 肖 包 𢦏 央 韋 章 召 未 可 曽
~49 | 42 | 采 畐 㕣 戔 司 票 系 既 広 君 凶 爰 因 旦 斉 尃 般 禺 尞 侵 兪 𠔉 兆 巨 毛 并 曷 臤 㐮 茲 垂 喿 辟 兼 奴 呉 曹 賁 岡 朱 夆 麻

Table 1. Phonetic components sorted by word scores

Factor 2. Within the range of the everyday-use vocabulary for learners, how many pronunciations does each phonetic component represent?

The number of pronunciations represented by one phonetic component ranges from one to 12. For example, as mentioned above, the phonetic component 長 is always pronounced chō, regardless of the word in which it occurs. In contrast, the phonetic component 各 has 12 pronunciations, and the appropriate pronunciation needs to be chosen depending on the word in which it occurs.

No. of pron. | No. of PhC | Phonetic components (PhC)
1 | 23 | 長 義 亢 章 尞 曹 巨 𢦏 冓 㐮 旨 氐 呉 求 兪 麻 五 侵 奴 喿 及 賁 旦
2 | 59 | 同 中 里 交 成 官 僉 未 化 可 令 咸 采 亡 啇 司 票 監 則 充 甫 奇 垂 付 夆 壮 禺 申 召 兆 良 曽 原 員 周 𠔉 央 系 敬 真 茲 肖 加 広 岡 因 既 凶 告 侖 孝 毛 君 臤 比 辟 㕣 蒦 兼
3 | 41 | 生 士 羊 正 市 其 半 亥 乍 東 也 甬 占 扁 止 方 干 包 責 雚 次 列 咅 意 予 韋 区 辰 軍 丂 斉 祭 炎 戔 害 尚 爰 更 般 毎 并
4 | 22 | 白 才 开 家 果 圣 复 丁 莫 共 非 皮 九 見 朱 珍 尃 昜 舌 曷 几 元
5 | 19 | 豆 工 青 古 畐 王 且 支 兄 束 主 直 谷 余 艮 尺 失 台 分
6 | 2 | 重 者
7 | 1 | 必
8 | 5 | 己 寺 寸 兌 反
9 | 1 | 少
10 | 0 | -
11 | 0 | -
12 | 1 | 各

Table 2. Phonetic components sorted by the number of pronunciations

Factor 3. Within the range of the everyday-use vocabulary for learners, how reliable is the phonological information suggested by each phonetic component?

For example, in the case of the phonetic component 義 gi, the learner is certain (100%) of successfully pronouncing the kanji with this phonetic component, regardless of the word in which it occurs. Similarly, the phonetic component 亡 is also of considerable reliability, since 98.6% of the time it is pronounced bō, even though this phonetic component also represents another pronunciation. In contrast, the phonetic component 台 can be pronounced in five different ways depending on the word in which it occurs. The chances of it being successfully pronounced are 30.7%, 21.9%, 20.4%, 15.3%, and 11.7%, of which the highest is only 30.7%. Such a low reliability is not helpful for the learner. In the case of the phonetic component 臤, two pronunciations are equally likely (50% and 50%).

Reliability (%) | No. of PhC | Phonetic components (PhC)
100 | 23 | 長 義 亢 章 尞 曹 巨 𢦏 冓 㐮 旨 氐 呉 求 兪 麻 五 侵 奴 喿 及 賁 旦
90~99 | 22 | 里 亡 官 僉 啇 交 甬 同 采 未 司 票 士 中 監 則 化 系 市 央 甫 奇
80~89 | 26 | 其 敬 意 孝 付 占 半 夆 扁 果 咅 侖 真 生 壮 害 成 禺 羊 茲 肖 加 原 分 豆 可
70~79 | 19 | 申 員 家 止 召 咸 広 兼 区 白 圣 蒦 束 畐 兆 炎 㕣 毎 干
60~69 | 20 | 乍 包 也 珍 岡 辰 正 因 東 責 既 凶 尚 反 才 令 良 九 复 工
50~59 | 31 | 支 臤 丂 兄 告 周 寸 𠔉 雚 軍 青 次 古 毛 曽 己 爰 韋 王 君 斉 辟 祭 比 方 亥 几 曷 重 共
40~49 | 19 | 少 开 谷 非 見 主 者 予 列 垂 朱 充 并 般 舌 寺 尃 更 且
30~39 | 12 | 必 失 余 兌 艮 元 戔 台 各 丁 直 尺
20~29 | 2 | 莫 昜
10~19 | 0 | -
0~9 | 0 | -

Table 3. Phonetic components sorted by reliability

Factor 4. Within the range of the everyday-use vocabulary for learners, how many kanji share a particular pronunciation of a certain phonetic component?

According to our database, the number of kanji sharing a common phonetic component with a common pronunciation ranges from one to nine. The phonetic component with the highest number of kanji sharing the same pronunciation is 工. This phonetic component occurs in nine kanji that are pronounced kō, which is one of the five possible pronunciations in Matsushita’s list of everyday-use vocabulary for learners. This means that the learner can use the knowledge that the phonetic component 工 is pronounced kō in those nine kanji. Even though this phonetic component has four other pronunciations, having one pronunciation shared by many kanji enhances the utility of the phonetic component. On the other hand, a phonetic component is not worth learning if there is only one kanji for each of many pronunciations. For example, the phonetic component 見 is shared by four kanji, but there is only one kanji for each of the four pronunciations. Likewise, in the case of the phonetic component 寸, even though it occurs in eight kanji, only one or two kanji are assigned to each of the pronunciations, which limits the usefulness of this phonetic component.

No. of kanji | No. of PhC | Phonetic components (PhC)
9 | 1 | 工
8 | 1 | 方
7 | 1 | 古
6 | 3 | 可 青 且
5 | 8 | 生 僉 召 白 己 止 包 皮
4 | 35 | 義 同 中 交 㐮 氐 兪 麻 士 羊 正 反 官 令 采 亡 啇 付 申 兆 良 曽 半 豆 寺 圣 干 复 次 丁 各 分 咅 者 莫
3 | 60 | 長 里 成 亢 章 尞 曹 巨 𢦏 冓 旨 呉 求 五 侵 奴 喿 及 未 化 司 票 監 則 賁 甫 奇 旦 夆 壮 禺 市 其 家 乍 東 才 开 𠔉 扁 果 兼 畐 責 雚 王 少 列 意 比 辟 非 兄 韋 辰 主 余 艮 亥 台
2 | 61 | 咸 原 員 周 也 甬 占 央 系 敬 真 茲 肖 加 広 岡 因 既 凶 告 侖 孝 毎 重 寸 毛 君 臤 支 共 予 必 㕣 蒦 区 束 珍 九 直 垂 充 丂 斉 祭 谷 朱 并 尃 戔 昜 害 尚 舌 兌 尺 更 般 失 几 元 曷
1 | 4 | 軍 見 炎 爰

Table 4. Phonetic components sorted by the number of kanji

Each of the factors affects the utility of a phonetic component, and none can be a determining factor on its own, since the effects of these factors are interwoven. If a phonetic component appears in only a few technical words, it is not worth learning. A certain phonetic component may seem more useful if it occurs in many widely used, frequently appearing words. However, this condition itself may not ensure the utility of the phonetic component. It would be confusing for learners if the phonetic component represents a number of different pronunciations. If a phonetic component occurs in frequently used words, and has only a few pronunciations, it may be said that the utility of the phonetic component is high. The utility is higher if the phonetic component has one most probable pronunciation, and lower if it does not have any highly probable pronunciation. If a phonetic component occurs in commonly used words, has only a few pronunciations, and the reliability of one of the pronunciations is high, the phonetic component appears to be very promising. This phonetic component is likely to be very useful if, on top of the above conditions, there are multiple kanji representing the same pronunciation. Therefore, criteria for selecting useful phonetic components must reflect all the above factors.

Ranking

To a certain extent, we have filtered the phonetic components to remove those that are unqualified (e.g., the ones that occur only in one or two kanji) prior to creating the database. Since the goal of this study is to find useful phonetic components, the remaining 174 phonetic components are further screened as follows.

In order to rank the phonetic components in terms of utility, we classify each phonetic component into High, Medium, and Low for each of the factors. If distributed evenly among these three levels, since the database contains 174 phonetic components, each level would have about 60 phonetic components. However, in reality, such an even distribution pattern is not apparent, as detailed below. Please note that the following criteria should be taken as one example, and should be altered according to the needs of learners/teachers.

Classification | Score of words | Number of phonetic components
High | 100~ | 76
Medium | 50~99 | 57
Low | ~49 | 41

Table 5. Factor 1 criteria: The number and score of words where the phonetic component occurs³

Classification | Number of pronunciations | Number of phonetic components
High | 1 | 23
Medium | 2 | 59
Low | 3~ | 92

Table 6. Factor 2 criteria: The number of pronunciations represented by a phonetic component⁴

Classification | Pronunciation reliability | Number of phonetic components
High | 80~ | 71
Medium | 50~79 | 69
Low | ~49 | 34

Table 7. Factor 3 criteria: The reliability of pronunciation suggested by a phonetic component⁵

Classification | Number of kanji | Number of phonetic components
High | 3~ | 109
Medium | 2 | 61
Low | 1 | 4

Table 8. Factor 4 criteria: The number of kanji sharing a common phonetic component and a common pronunciation⁶
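The criteria in Tables 5 to 8 can be expressed as a small classification function. The Python sketch below is one possible reading of those thresholds, not the authors' implementation: the function name `classify` is hypothetical, and treating reliabilities between 49 and 50 (or between 79 and 80) as belonging to the lower class is an assumption, since the tables give only whole-number boundaries.

```python
def classify(word_score, n_pronunciations, reliability, n_kanji):
    """Return a four-letter type such as 'HLMH' for Factors 1 to 4."""
    # Factor 1 (Table 5): weighted word score.
    f1 = "H" if word_score >= 100 else "M" if word_score >= 50 else "L"
    # Factor 2 (Table 6): fewer pronunciations is better.
    f2 = "H" if n_pronunciations == 1 else "M" if n_pronunciations == 2 else "L"
    # Factor 3 (Table 7): reliability (%) of the most reliable pronunciation.
    # Assumption: fractional values below a boundary fall into the lower class.
    f3 = "H" if reliability >= 80 else "M" if reliability >= 50 else "L"
    # Factor 4 (Table 8): largest number of kanji sharing one pronunciation.
    f4 = "H" if n_kanji >= 3 else "M" if n_kanji == 2 else "L"
    return f1 + f2 + f3 + f4


# 青: score 200, five pronunciations, 55% reliability, six kanji read sei.
print(classify(200, 5, 55.0, 6))  # HLMH
```

With the figures reported for 青 in this paper, the function yields the type HLMH.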

Using the above criteria, first, each phonetic component is typed with a combination of H, M, and L (e.g., HMHL) according to evaluation of the four factors. Altogether, 35 types are identified. Secondly, the types of phonetic components are grouped according to their utility (see Table 9).

The types of phonetic components with a combination of just Hs and Ms, and no Ls, form the top group. In addition to these types, two more types that have three Hs and one L are included in this group. In total, 11 types and 69 phonetic components are thus identified as very useful phonetic components. The second group consists of the ones with one or two Hs and only one L. There are nine types and 50 phonetic components in this group. These are considered to be a group of relatively useful phonetic components. The third group has the components with one or two Hs and two or three Ls, or all Ms. Seven types and 23 phonetic components are in this group. Caution is needed when applying the knowledge of these phonetic components in reading words. The lowest group is comprised of the components that have no Hs. Eight types and 32 phonetic components form this group. These phonetic components are probably not worth learning since their utility is limited in reading Japanese words. Therefore, we recommend 119 phonetic components (69 very useful ones and an additional 50 relatively useful ones) to be learned by, or taught to, learners of the Japanese language.

Stars | F1 F2 F3 F4 | No. of PhC | PhCs

Reliability and usefulness: High (all Hs and Ms, and 3 Hs and 1 L)
*** | H H H H | 2 | 長 義
*** | M H H H | 9 | 亢 章 𢦏 冓 旨 氐 求 五 及
*** | H M H H | 8 | 官 僉 化 里 同 中 成 交
*** | M M H H | 10 | 亡 啇 監 則 甫 奇 付 壮 未 可
*** | H M H M | 3 | 原 侖 孝
*** | M M H M | 5 | 央 敬 真 肖 加
*** | H M M H | 1 | 令
*** | M M M H | 5 | 申 召 良 曽 比
*** | H M M M | 4 | 咸 員 周 告
*** | H L H H | 10 | 羊 生 士 市 其 半 豆 分 咅 意
*** | L H H H | 12 | 㐮 尞 喿 兪 奴 曹 侵 呉 巨 麻 賁 旦

Reliability and usefulness: Medium (1 or 2 Hs and 1 L)
** | H L H M | 2 | 甬 占
** | H L M H | 18 | 正 反 方 青 工 己 古 白 家 乍 東 才 雚 次 王 开 少 皮
** | H L M M | 6 | 也 毎 寸 支 共 重
** | M L H H | 2 | 扁 果
** | M L M H | 10 | 止 圣 干 包 責 复 兄 韋 辰 亥
** | M L H M | 1 | 害
** | L M H H | 5 | 夆 禺 票 司 采
** | L M H M | 2 | 茲 系
** | L M M H | 4 | 兆 𠔉 兼 辟

Reliability and usefulness: Low (Ms, and 1 or 2 Hs and 2 or 3 Ls)
* | M M M M | 2 | 蒦 充
* | H L M L | 1 | 軍
* | H L L H | 10 | 寺 列 丁 且 各 者 余 艮 非 台
* | H L L M | 7 | 予 必 直 舌 元 主
* | H L L L | 1 | 見
* | M L L H | 1 | 莫
* | L L M H | 1 | 畐

Reliability and usefulness: Very low (no Hs)
* | M L M M | 9 | 区 束 珍 九 丂 祭 昜 尚 几
* | M L M L | 1 | 炎
* | M L L M | 4 | 谷 尺 失 更
* | L M M M | 10 | 広 岡 因 既 凶 臤 毛 君 㕣 垂
* | L M M L | 0 | -
* | L L M M | 3 | 斉 曷 戔
* | L L M L | 1 | 爰
* | L L L M | 4 | 朱 并 尃 般

Total | | 174 |

Table 9. Table of ranking
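The grouping of the 35 types into four utility groups can be checked with a short Python sketch. This is a hedged reconstruction from the prose rules and from Table 9, not the authors' procedure: the function name `utility_group` and the group labels are hypothetical, and the all-M type is placed in the third group because the text says so explicitly, even though it also contains no Ls. The per-type counts are transcribed from Table 9.

```python
def utility_group(type_code):
    """Map a four-letter type (e.g. 'HLMH') to one of the four utility groups."""
    h = type_code.count("H")
    l = type_code.count("L")
    if h == 0 and l == 0:
        return "low"        # all Ms: assigned to the third group in the text
    if h == 0:
        return "very low"   # no Hs
    if l == 0 or (h == 3 and l == 1):
        return "high"       # Hs and Ms only, or three Hs and one L
    if l == 1:
        return "medium"     # one or two Hs and exactly one L
    return "low"            # one or two Hs and two or three Ls


# Number of phonetic components per type, transcribed from Table 9.
TYPE_COUNTS = {
    "HHHH": 2, "MHHH": 9, "HMHH": 8, "MMHH": 10, "HMHM": 3, "MMHM": 5,
    "HMMH": 1, "MMMH": 5, "HMMM": 4, "HLHH": 10, "LHHH": 12,
    "HLHM": 2, "HLMH": 18, "HLMM": 6, "MLHH": 2, "MLMH": 10, "MLHM": 1,
    "LMHH": 5, "LMHM": 2, "LMMH": 4,
    "MMMM": 2, "HLML": 1, "HLLH": 10, "HLLM": 7, "HLLL": 1, "MLLH": 1, "LLMH": 1,
    "MLMM": 9, "MLML": 1, "MLLM": 4, "LMMM": 10, "LMML": 0, "LLMM": 3,
    "LLML": 1, "LLLM": 4,
}

group_totals = {}
for type_code, count in TYPE_COUNTS.items():
    group = utility_group(type_code)
    group_totals[group] = group_totals.get(group, 0) + count

print(group_totals)  # {'high': 69, 'medium': 50, 'low': 23, 'very low': 32}
```

The totals agree with the 69, 50, 23, and 32 phonetic components reported for the four groups, and sum to 174.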

At a glance, it may appear that many of the high-ranking phonetic components presented in this list are unlikely to be encountered by beginner or intermediate students, and that some of the medium- to low-ranking components may be more familiar to them. This is because the current list of phonetic components is for general learners who encounter words in various texts in various domains of discourse during an entire period of their study of Japanese, from beginner to advanced levels.

Some of the high-ranked phonetic components are not immediately useful for low proficiency learners. Rather they are useful for long-term learners in deriving or retrieving the pronunciations of unfamiliar or poorly memorized kanji when reading kanji words. Therefore, learners need to be reminded constantly and repeatedly, every time they encounter a word containing kanji with a useful component, that those phonetic components are and will be reliable and applicable over time. Showing other words containing the target phonetic components, even if they are not part of their target learning words at that particular time, may be one effective approach.

Comparison

Based on the analysis of the four factors, we have created a list of phonetic components highlighting 119 useful components. Although comparing the results of the current study with the results of the previous studies (i.e., Kano 1993, Townsend 2011) may be worthwhile, a rigid statistical comparison between the current list and the lists in the previous studies is not feasible due to the different number of samples and the different criteria used for ranking. Both Kano (1993) and Townsend (2011) analyzed and ranked phonetic components according to their utility measured by only one factor, that is, the number of kanji with a common phonetic component showing a common pronunciation or similar pronunciations. However, their results are not identical, probably because Kano analyzed the 1,945 old-version everyday-use kanji whereas Townsend analyzed 2,230 kanji (the 1,945 kanji plus the 285 kanji used for people’s names), and also because they both used their own subjective criteria for judging the usefulness of the phonetic components that have “similar” pronunciations. In the present study, we analyzed (at the word level) the 2,136 kanji appearing in the current list for everyday-use, and also in Matsushita’s list of everyday-use vocabulary for learners, while taking the above four factors into consideration. Another difference is that, in the current study, we did not make a similarity judgment (i.e., we treated non-identical pronunciations as different pronunciations) in order to avoid making any subjective judgments.

Nevertheless, just by placing the three lists side by side, it becomes clear that the utility measured at the word level is different from that measured at the kanji level (see Appendix 2). By having incorporated the word level information, it has been revealed that some phonetic components that have previously been considered to be highly useful are actually less so. For example, the phonetic component 莫 has been considered a very reliable phonetic component as it occurs in five kanji that have a common pronunciation bo. Despite this positive aspect, this phonetic component has too many downsides to be considered useful. This phonetic component represents four pronunciations, bo, mo, baku and maku, and none of these pronunciations can be called the main pronunciation, as the reliabilities of these individual pronunciations are more or less the same (20–30%). Moreover, there are not many words with this phonetic component. In the case of 朱, another phonetic component that was identified as a highly useful phonetic component in previous studies (i.e., Kano 1993, Townsend 2011), the current word level study reveals that one of the kanji listed in their lists, 株, is never pronounced in its on-yomi, shu, in Matsushita’s list of everyday-use vocabulary for learners. Moreover, this phonetic component has many possible pronunciations, none of which are dominant, and occurs in few words, all factors which contributed to decreasing the utility of this particular phonetic component.

Thus, the current list, which has excluded the phonetic components that are likely to cause confusion, should be beneficial for learners who wish to make use of phonetic components in deriving or retrieving the pronunciations of unfamiliar or poorly memorized kanji when reading kanji words. Also, our list of phonetic components should be more effective than the previous lists for Japanese language educators who wish to incorporate phonetic components in their vocabulary instruction.

Conclusion

In this study, we examined the utility of phonetic components at the word level, ranked them according to the degree of utility, and compared these results with those of previous studies. We argued that there were four essential factors influencing the utility of phonetic components at the word level, beyond the total number of kanji in which a phonetic component occurs at the kanji level. They were, within the range of the learner everyday-use vocabulary: (1) How applicable is knowledge of the phonetic component in words within which the phonetic component occurs?, (2) How many pronunciations does the phonetic component represent?, (3) How reliable is the phonological information suggested by the phonetic component?, and (4) How many kanji share the particular pronunciation of the phonetic component? These four factors were used to analyze, classify and rank phonetic components. Based on the results of the analysis, we recommend 119 phonetic components (69 very useful ones and an additional 50 relatively useful ones) to be learned by, or taught to, learners of the Japanese language. We believe that the findings of this study have provided some valuable information to learners and instructors of Japanese as a second/foreign language, as well as to researchers whose interests lie in applied linguistics and psycho-linguistics.

Nevertheless, this attempt to rank phonetic components in terms of utility is by no means perfect. Further research is needed in order to address the following points: It remains unclear as to whether phonetic components that can be a stand-alone kanji are more useful than those that cannot. For native speakers who already have knowledge of the everyday-use kanji, those that are also used as kanji must be more readily accessible and therefore useful due to the familiarity effect. However, for learners of kanji, whether this makes a difference is not known. In relation to this issue in regard to phonetic components that can be kanji in their own right, a further question arises as to whether the components that have the same pronunciations, either as kanji or a phonetic component, are more useful than those that have different pronunciations for each form. This seems to be the case for native speakers (Masuda and Saito 2002, Saito et al. 1998). However, for learners who do not have prior knowledge, this aspect may be irrelevant. These issues need to be further investigated, not only at the kanji level, but also at the word level.

Another issue is how we should treat “similar pronunciations.” It has been claimed that the pronunciations of kanji sharing a common phonetic component often rhyme (Jackson, Lu, and Ju 1994). However, determining whether such rhymes result in a perceptibly similar pronunciation is not an easy task. Kaiser (1996) investigated whether learners from different countries perceive various pronunciations in a similar manner. The results of Kaiser’s study showed that, although there was a slight level of consensus among the learners, differences between different language speakers were quite large, which suggested that there are no absolutes in regard to perceptibly similar pronunciations. At present, there is no previous study addressing the effect of similar pronunciations on the use of phonetic components. As such, more research is called for in order to improve the current list of phonetic components.

Along with the improvement of the database, it is also important to investigate whether explicit instruction about phonetic components has a measurable effect on learners. A caveat is required here. Most probably, short-term explicit instruction targeting learners at the beginner or intermediate levels would not produce reliable results. Longitudinal studies following a group of students from beginner to advanced levels are required in order to obtain trustworthy results. This is because the selected phonetic components are identified as being useful based on the words which general learners (who have a wide range of interests) are expected to encounter in their years of Japanese language learning. For dedicated learners who are in a particular domain, domain-specific lists of phonetic components may be necessary. Outside of Japan, where there may be more short-term learners (e.g., ones who only attain an intermediate level of proficiency) than long-term learners, we may need to make separate lists of useful phonetic components suitable for learners at different levels of proficiency.

We are now in the Asian Century, which may parallel the American Century (the twentieth century) and the British Century (the nineteenth century). In 2012, for instance, the Australian government released a report entitled “Australia in the Asian Century,” which places great emphasis on the efficient teaching/learning of Asian languages, including Japanese. In order to help learners become proficient users of kanji, which is known as the most difficult aspect of the Japanese language, in a limited period of time, a more strategic approach to kanji teaching is required. One approach may be the use of the phonological information of phonetic components for reading kanji words. We believe that our database, with its comprehensive information, and our list of phonetic components, with its focus on useful phonetic components, will be useful for both research and learning/teaching of kanji words.7

Notes

This work was supported by JSPS Grant-in-Aid for Scientific Research (B), Grant Number: 23320102. We would like to express our sincere gratitude to Tatsu Matsushita, who responded to our numerous questions about his database. We thank Dallas Nesbitt and anonymous reviewers who gave us very constructive and helpful comments on early versions of this manuscript.

1 Quoted from Reviewing the Kanji Forum http://forum.koohii.com/viewtopic.php?id=764.

2 See http://www.wa.commufa.jp/~tatsum/English%20top_Tatsu.html.

3 Because the database contains 174 phonetic components, approximately 60 phonetic components are allocated to each level, and then the lines are drawn at the scores 100 and 50 for convenience.

4 For this factor, unlike the other three factors, the smaller the number, the higher the rank of the phonetic component. Phonetic components with only one pronunciation are separated from the rest although the number of such phonetic components is much smaller (23) than those with multiple pronunciations. Having only one pronunciation means that the learner can apply the knowledge to all words containing the phonetic component.

5 For convenience, the lines are drawn at 80 and 50.

6 For our database, only the phonetic components occurring in three or more kanji within the current 2,136 everyday-use kanji have been included; the phonetic components that occur in only one or two kanji were not included. This has automatically reduced the number of phonetic components with only one kanji exhibiting only one pronunciation in the above table. Despite the very small number, a separate row is provided for the phonetic components with only one kanji per pronunciation. They need to be separated from the others because the relationship between the kanji and the pronunciation is not transferable, and the learner cannot use the knowledge with any other kanji.

7 Contact the first author, Etsuko Toyoda, for the full database.
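
The cut-off rules in Notes 3, 5, and 6 can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the level labels ("high", "mid", "low"), and the treatment of boundary values (a score exactly on a line is placed in the higher level) are assumptions made for illustration. The notes themselves state only the cut-off values and the three-kanji inclusion criterion.

```python
from collections import Counter


def level(score, upper, lower):
    """Place a score into one of three levels using two cut-off lines.

    Assumption: a score equal to a cut-off falls in the higher level;
    the notes give only the cut-off values themselves.
    """
    if score >= upper:
        return "high"
    if score >= lower:
        return "mid"
    return "low"


def frequent_components(kanji_to_component, min_kanji=3):
    """Keep components that occur in at least `min_kanji` kanji (Note 6)."""
    counts = Counter(kanji_to_component.values())
    return {c for c, n in counts.items() if n >= min_kanji}


# Cut-offs from Note 3 (100 and 50) and Note 5 (80 and 50).
print(level(120, upper=100, lower=50))  # high
print(level(60, upper=80, lower=50))    # mid

# Toy kanji-to-component mapping for illustration only;
# it is not taken from the authors' database.
toy = {"長": "長", "帳": "長", "張": "長", "清": "青", "晴": "青"}
print(frequent_components(toy))         # {'長'}
```

In this toy mapping 青 is dropped only because two kanji are listed for it; the real database works from all 2,136 everyday-use kanji.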

Appendix 1

[Table: database entries for PhC No. 1 (長) and PhC No. 72. Columns: PhC No.; PhC; pronunciation; kanji; No. of words in each level of difficulty (B, I, A, H, S); No. and score for each kanji, for each pron., and for each PhC; No. of pron.; reliability of main pron.; No. of kanji per pron.; and H/M/L ratings for score, pronunciation, reliability, and # kanji.]

Appendix 2

#    PhC  Toyoda et al.  Kano  Townsend  #    PhC  Toyoda et al.  Kano  Townsend
1    長   ***            ***   ***       88   加   ***            *     N
2    義   ***            ***   ***       89   広   *              *     N
3    同   ***            ***   ***       90   岡   *              **    N
4    中   ***            ***   ***       91   因   *              *     N
5    里   ***            **    N         92   既   *              *     N
6    交   ***            ***   ***       93   凶   *              *     N
7    成   ***            **    ***       94   告   ***            *     N
8    亢   ***            **    N         95   侖   ***            *     N
9    章   ***            **    N         96   孝   ***            *     ***
10   尞   ***            **    N         97   扁   **             **    ***
11   曹   ***            *     ***       98   果   **             **    ***
12   巨   ***            **    **        99   止   **             **    N
13   𢦏   ***            N     N         100  兼   **             **    N
14   冓   ***            ***   N         101  圣   **             ***   ***
15   㐮   ***            ***   N         102  畐   *              **    ***
16   旨   ***            **    N         103  干   **             ***   ***
17   氐   ***            ***   ***       104  包   **             ***   ***
18   呉   ***            **    N         105  責   **             **    N
19   求   ***            **    N         106  复   **             ***   ***
20   兪   ***            ***   N         107  雚   **             **    N
21   麻   ***            ***   N         108  次   **             ***   ***
22   五   ***            **    N         109  王   **             *     N
23   侵   ***            *     N         110  少   **             **    **
24   奴   ***            **    N         111  列   *              **    N
25   喿   ***            ***   ***       112  丁   *              ***   **
26   及   ***            ***   ***       113  且   *              ***   ***
27   生   ***            ***   ***       114  各   *              ***   ***
28   士   ***            ***   ***       115  毎   **             *     **
29   羊   ***            ***   **        116  重   **             *     N
30   官   ***            ***   ***       117  寸   **             *     N
31   僉   ***            ***   ***       118  分   ***            **    ***
32   未   ***            **    N         119  咅   ***            ***   N
33   化   ***            ***   ***       120  意   ***            *     N
34   可   ***            ***   ***       121  毛   *              *     N
35   令   ***            ***   ***       122  君   *              *     N
36   亡   ***            ***   ***       123  臤   *              *     N
37   啇   ***            ***   ***       124  比   ***            *     N
38   監   ***            **    N         125  辟   **             *     ***
39   則   ***            **    ***       126  支   **             **    N
40   奇   ***            **    ***       127  共   **             **    ***
41   付   ***            ***   ***       128  予   *              *     **
42   壮   ***            **    N         129  非   *              **    ***
43   曽   ***            ***   ***       130  皮   **             **    ***
44   原   ***            *     N         131  必   *              *     **
45   市   ***            **    N         132  蒦   *              *     N
46   其   ***            ***   ***       133  兄   **             **    N
47   半   ***            **    ***       134  韋   **             **    N
48   豆   ***            ***   N         135  区   *              *     **
49   正   **             **    ***       136  束   *              *     N
50   反   **             ***   ***       137  珍   *              N     ***
51   咸   ***            *     N         138  九   *              *     N
52   采   **             **    ***       139  辰   **             ***   ***
53   司   **             ***   ***       140  軍   *              *     *
54   票   **             **    N         141  丂   *              **    N
55   賁   ***            **    N         142  祭   *              *     N
56   甫   ***            ***   ***       143  害   **             *     N
57   旦   ***            **    ***       144  㕣   *              *     N
58   夆   **             N     ***       145  主   *              ***   *
59   禺   **             **    ***       146  者   *              ***   **
60   申   ***            **    ***       147  見   *              *     ***
61   召   ***            ***   ***       148  直   *              *     ***
62   兆   **             ***   **        149  垂   *              **    N
63   良   ***            ***   ***       150  充   *              *     N
64   員   ***            *     N         151  斉   *              **    N
65   周   ***            *     **        152  谷   *              *     ***
66   家   **             **    N         153  朱   *              ***   ***
67   白   **             ***   ***       154  并   *              *     N
68   乍   **             ***   ***       155  尃   *              **    N
69   東   **             **    N         156  炎   *              N     N
70   才   **             **    **        157  戔   *              **    ***
71   工   **             ***   ***       158  昜   *              *     N
72   青   **             ***   ***       159  尚   *              N     ***
73   古   **             **    ***       160  舌   *              *     N
74   己   **             ***   ***       161  余   *              *     N
75   开   **             **    N         162  艮   *              ***   ***
76   寺   *              ***   ***       163  兌   *              N     *
77   方   **             ***   ***       164  尺   *              *     N
78   也   **             *     *         165  爰   *              N     N
79   甬   **             *     ***       166  更   *              *     N
80   占   **             *     *         167  般   *              *     N
81   𠔉   **             N     ***       168  曷   *              **    N
82   央   ***            *     N         169  亥   **             *     **
83   系   **             *     ***       170  失   *              *     N
84   敬   ***            *     N         171  几   *              **    ***
85   真   ***            *     N         172  元   *              *     **
86   茲   **             **    N         173  台   *              **    **
87   肖   ***            **    ***       174  莫   *              ***   ***

Note: N means that the PhC is not part of the list.
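
As a minimal sketch of how the ratings in Appendix 2 might be compared across the three lists, the snippet below encodes a handful of rows from the table (PhC Nos. 1, 5, 12, 13, and 88) and picks out the components that appear in all three lists and those rated *** by all three. The tuple layout and the helper names are assumptions for illustration; N is read as "not part of the list," as stated in the note.

```python
# (PhC, Toyoda et al., Kano, Townsend) for a few rows of Appendix 2.
ROWS = [
    ("長", "***", "***", "***"),  # No. 1
    ("里", "***", "**", "N"),     # No. 5
    ("巨", "***", "**", "**"),    # No. 12
    ("𢦏", "***", "N", "N"),      # No. 13
    ("加", "***", "*", "N"),      # No. 88
]


def in_all_lists(row):
    """True if no list marks the component as N (not part of the list)."""
    return "N" not in row[1:]


def top_in_all(row):
    """True if all three lists give the component ***."""
    return all(rating == "***" for rating in row[1:])


shared = [row[0] for row in ROWS if in_all_lists(row)]
unanimous = [row[0] for row in ROWS if top_in_all(row)]
print(shared)     # ['長', '巨']
print(unanimous)  # ['長']
```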

References

Bunkachō. 2010. Jōyō kanji hyō (List of everyday-use kanji). http://www.bunka.go.jp/kokugo_nihongo/kokujikunrei_h221130.html, accessed on March 5, 2012.

Chen, Xi, Hua Shu, Ningning Wu, and Richard Anderson. 2003. Stages in Learning to Pronounce Chinese Characters. Psychology in the Schools 40: 115–124.

Chigusa Shinichi. 2008. Jōyō kanji no ninchi mojiron teki kōsatsu (Study of character recognition: Everyday-use kanji). Ruikeigaku kenkyū (Typology research) 2: 141–183.

Chikamatsu Nobuko. 1996. The Effects of L1 Orthography on L2 Word Recognition: A Study of American and Chinese Learners of Japanese. Studies in Second Language Acquisition 18: 403–432.

Den Keijiro. 1999. Kanji ryoku—raku kan soku kiokuhō (Kanji ability—easy, simple, fast memorization method). Tokyo: Taiyō shuppan.

Flores d’Arcais, Giovanni B., Hirofumi Saito, and Masahiro Kawakami. 1995. Phonological and Semantic Activation in Reading kanji Characters. Journal of Experimental Psychology: Learning, Memory, and Cognition 21(1): 34–42.

Geva, Esther, and Min Wang. 2001. The Development of Basic Reading Skills in Children: A Cross-Language Perspective. Annual Review of Applied Linguistics 21: 182–204.

Heisig, James W. 1987. Remembering the Kanji 2. Honolulu: University of Hawai’i Press.

Ishizawa Seiji. 2012. Onpu jun jōyō kanji gakushū jiten (Jōyō kanji study dictionary in the order of phonetic components). Osaka: Ishizawa shoten.

Jackson, N. E., W.-H. Lu, and D. Ju. 1994. Reading Chinese and Reading English: Similarities, Differences, and Second-Language Reading. The Varieties of Orthographic Knowledge—Theoretical and Developmental Issues, ed. by V. W. Berninger, 73–110. Dordrecht, Netherlands: Kluwer Academic Publishers.

Kaiho H., and Y. Nomura. 1983. Kanji jōhō shori no shinrigaku (The psychology of kanji information processing). Tokyo: Kyōiku shuppan.

Kaiser, Stefan. 1996. Kanji gakushū to on’in (Rhyme in kanji learning). Journal of Japanese Language Teaching International Student Center, University of Tsukuba, 11: 99–112.

Kano Chieko. 1993. Kanji no zōji seibun ni kan-suru ichi-kōsatsu (Study on basic Japanese components of kanji) (2). Bungei gengo kenkyū (Studies in language and literature) 24: 97–114.

Kano Yoshimitsu. 1998. Jōyō kanji mirakuru masutā jiten (Jōyō kanji miracle master dictionary). Tokyo: Shōgakkan.

Koda Keiko. 1995. Cognitive Consequences of L1 and L2 Orthographies. Scripts and Literacy: Reading and Learning to Read Alphabets, Syllabaries and Characters, ed. by I. Taylor and D. R. Olson, 311–326. Dordrecht, Netherlands: Kluwer Academic Publishers.

———. 1999. Role of Intraword Awareness in kanji Knowledge Development. Paper presented at the AILA World Congress 1999, Tokyo, Japan.

———. 2002. Writing Systems and Learning to Read in a Second Language. Chinese Children’s Reading Acquisition, ed. by W. Li, J. S. Gaffney, and J. L. Packard, 225–248. Boston: Kluwer Academic Publishers.

Leong, Che Kan, and Tamaoka Katsuo. 1995. Use of Phonological Information in Processing kanji and katakana by Skilled and Less Skilled Japanese Readers. Reading and Writing: An Interdisciplinary Journal 7: 377–393.

Masuda Hisashi and Saito Hirofumi. 2002. Interactive Processing of Phonological Information in Reading Japanese kanji Character Words and Their Phonetic Radicals. Brain and Language 81: 445–453.

Matsushita Tatsuhiko. 2011. 日本語を勉強する人のための語彙データベース Version 1.0 (留学生用) (Vocabulary database for learners of Japanese, Version 1.0). http://tatsuma2010.web.fc2.com/, accessed on December 10, 2011.

———. 2012. In What Order Should Learners Learn Japanese Vocabulary? A Corpus-Based Approach. Ph.D. dissertation, Victoria University of Wellington.

Miyashita Hisao. 1991. Kanji ga tanoshiku naru hon 4 (The book that makes you enjoy kanji 4). Tokyo: Tarō Jirō Sha.

———. 2000. Wakareba mitsukaru shitteru kanji (If you understand this, you can find the kanji you know). Tokyo: Tarō Jirō Sha.

Mori Yoshiko. 1998. Effects of First Language and Phonological Accessibility on kanji Recognition. The Modern Language Journal 82 (1): 69–81.


———. 1999. Beliefs about Language Learning and Their Relationship to the Ability to Integrate Information from Word Parts and Context in Interpreting Novel kanji Words. Modern Language Journal 83: 534–547.

Noguchi, Mary S. 1995. Component Analysis of kanji for Learners from Non-kanji Using Countries. The Language Teacher 19 (10). http://www.kanjiclinic.com/langteacherca.htm, accessed on February 5, 2012.

Nomura Masaaki. 1981. Jōyōkanji no onkun (Statistics of the on-kun readings of jōyō kanji). Keiryō kokugogaku (Mathematical linguistics) 13(1): 27–33.

Pye, Michael. 1971. The Study of Kanji. Tokyo: Hokuseidō Press.

Reviewing the Kanji Forum http://forum.koohii.com/viewtopic.php?id=764, accessed on February 12, 2012.

Saito, H., H. Masuda, and M. Kawakami. 1998. Form and Sound Similarity Effects in kanji Recognition. Reading and Writing: An Interdisciplinary Journal 10 (3/5): 323–357.

———. 1999. Subword Activation in Reading Single kanji Character Words. Brain and Language 68: 75–81.

Shirakawa Shizuka. 2007. Jitō (The system of characters). Tokyo: Heibonsha.

Shu, Hua and Richard Anderson. 1998. Learning to Read Chinese: The Development of Metalinguistic Awareness. Reading Chinese Script: A Cognitive Analysis, ed. by J. Wang, A. Inhoff, and H. Chen, 1–18. Mahwah, N.J.: Lawrence Erlbaum Associates.

Taft, Marcus and Xiaoping Zhu. 1995. The Representation of Bound Morphemes in the Lexicon: A Chinese Study. Morphological Aspects of Language Processing, ed. by L. B. Feldman, 293–316. Hove, U.K. and Hillsdale, N.J.: Lawrence Erlbaum Associates.

———. 1997. Submorphemic Processing in Reading Chinese. Journal of Experimental Psychology: Learning, Memory, and Cognition 23: 761–775.

Tamaoka Katsuo. 1991. Psycholinguistic Nature of the Japanese Orthography. Studies in Language and Literature 11 (1): 49–82.

———. 2003. Where Do Statistically-Derived Indicators and Human Strategies Meet When Identifying on- and kun-readings of Japanese kanji? Cognitive Studies 10 (4): 441–468.

Tamaoka Katsuo and Yamada Hiroyuki. 2000. The Effects of Stroke Order and Radicals on the Knowledge of Japanese kanji Orthography, Phonology and Semantics. Psychologia Society 43 (3): 199–210.

Townsend, Hiroko. 2011. Phonetic Components in Japanese Characters. Masters Thesis submitted to the Faculty of Arts, San Diego State University.

Toyoda Etsuko. 2001. English-Speaking Learners’ Use of Component Information in Processing Unfamiliar kanji. Australian Review of Applied Linguistics 23 (1): 1–14.

———. 2007. Enhancing Autonomous L2 Vocabulary Learning for Focusing on the Development of Word-Level Processing Skills. Reading Matrix 7 (3): 13–22.

———. 2009. Learning to Read Non-Alphabetic Script. Saarbrücken, Germany: VDM Verlag.

Wang, W. S.-Y. 1981. Language Structure and Optimal Orthography. Perception of Print: Reading Research in Experimental Psychology, ed. by O. J. L. Tzeng and H. Singer, 223–236. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Xing, Janet Zhiqun. 2006. Teaching and Learning Chinese as a Foreign Language. Hong Kong: Hong Kong University Press.

Yamamoto Yasutaka. 2007. Kanji onpu jiten (Kanji phonetic component dictionary). Osaka: Ado Poporo.

Yamashita Hiroko and Maru Yukiko. 2000. Compositional Features of kanji for Effective Instruction. Journal of the Association of Teachers of Japanese 34 (2): 159–178.