Magic Mirror

Jun-Ren Ding1, Chien-Lin Huang2, Jin-Kun Lin1, Jar-Ferr Yang1 and Chung-Hsien Wu2 Institute of Computer and Communication Engineering, Department of Electrical Engineering1,

Department of Computer Science and Information Engineering2 National Cheng Kung University, Tainan, Taiwan

[email protected]

Abstract

This investigation describes a novel design and implementation of an interactive multimedia mirror system, called the “Magic Mirror.” The Magic Mirror can easily be implemented on existing personal computers or hand-held devices with normal peripherals and regular reflective glass by integrating image/speech processing, Internet connectivity, and 3D and multimedia software. The integrated Magic Mirror, which includes speech recognition, speech synthesis, face detection/modification/recognition, a 3D virtual genius, a hidden LCD mirror, and a camera, uses simple syndication to capture information through its peripherals and network connections. The user can easily activate personal multimedia services using verbal commands. The Magic Mirror can function like a good friend who listens to the user’s questions and automatically responds to these requests, providing relaxation and consolation. Moreover, the Magic Mirror can detect a user’s feelings based on speech and image recognition features and select appropriate music and speech to alter the user’s mood.

1. Introduction

Smart home designs to improve the comfort, convenience, and security of homes are becoming increasingly important in information communication technology (ICT) to enable new user-friendly services. Based on pervasive and ubiquitous computing, many investigations have successfully integrated diverse human-computer interface technologies to implement advanced smart living products [1]–[7]. In several investigations, smart homes have been developed by combining monitor and mirror systems [2], [3]. While such studies have highlighted the potential of smart homes, their extensive applications for the future have not yet been demonstrated. Although [2] utilized a projector and charge-coupled device (CCD) camera to implement a Magic Mirror, the proposed system has three major limitations. First, it is impractical to install such a system in a normal house owing to the large space requirement. Second, an overlapped image is formed between the projector and mirror. Finally, the projector performs poorly in a relatively bright environment due to the influence of reflected light. In [3], a Magic Mirror was developed for application in a bathroom; its features include user detection, display method, and content function. Acrylic was utilized instead of glass for this mirror. However, acrylic degrades easily through scraping, strong light reflection, and heating. In addition, a CCD camera and infrared sensors need to be installed behind the mirror. Moreover, placing an expensive radio frequency identification (RFID) device in a toothbrush seems excessive; toothbrushes are typically not carried to other places. Conversely, if all the members of a household place their toothbrushes together in one bathroom, the personal information service is rendered redundant.

This investigation considers the practical factors of cost and usability to develop a Magic Mirror for applications wherein it is genuinely useful. The Magic Mirror is not restricted to a single indoor environment; it can serve in a bathroom, living room, office, or shopping store. We adopt many different techniques to develop an effective Magic Mirror for practical environments. The remainder of this paper is organized as follows. Section II describes the design concept of the Magic Mirror; the hardware components of the Magic Mirror are also described in detail in this section. Section III describes the video and speech signal processing techniques used to design a human-machine interface (HMI) that enables the interactivity of the Magic Mirror. The conclusions are presented in Section IV along with recommendations for future research.

2. Hardware and Functionality Design

The story of “Snow White” features a Magic Mirror that knows everything and provides updated information to the user (i.e., the Queen) in an interactive manner. This investigation describes the design of an interactive multimedia mirror system, called the Magic Mirror, providing speech recognition and synthesis, 3-dimensional (3D) graphic generation, Internet connectivity, and multimedia services.

Ninth IEEE International Symposium on Multimedia 2007. 0-7695-3058-3/07 $25.00 © 2007 IEEE. DOI 10.1109/ISM.2007.11

Fig. 1. Characteristics of the reflective glass: (a) reflective glass; (b) mirror state (display off); (c) Magic Mirror state (display on).

With regard to the hardware system, the Magic Mirror includes a microphone and a pair of loudspeakers for speech interaction and audio. It also includes a liquid crystal display (LCD) panel and a camera, which are covered by a plate of reflective glass, for human interaction and video. As shown in Fig. 1(a), the reflective glass has the following characteristic [8]: if the luminance in front of the glass is greater than that behind it, the lighter side exhibits the attributes of a mirror, while the darker side is transparent. Once the reflective glass is installed in front of the LCD panel, the Magic Mirror, as shown in Fig. 1(b), acts like a real mirror if the LCD display is switched off. In contrast, the Magic Mirror displays an image directly, as shown in Fig. 1(c), when the LCD display is switched on. In addition to the reflective mirror, heat insulation paper and polyethylene can also be employed to produce similar effects for different application environments. A CCD camera and infrared sensors are installed on top of the LCD display and covered by the reflective mirror. As shown in Fig. 2, the CCD camera can easily capture an image without being affected by the reflective glass. A microphone can be installed under the Magic Mirror or in any other appropriate location for better speech perception. For greater distances, a personal digital assistant (PDA) or a Bluetooth microphone can be utilized for speech commands and touch panel controls. A Magic Mirror that can be used as a smart lifestyle product should be able to perform as many applications as possible. However, several practical factors should be considered, including hardware cost, computational complexity, inclusive environments, and personal adaptation. Clearly, nobody would want to purchase a Magic Mirror that can only depict a story scenario.
The Magic Mirror should be able to retrieve Internet information and provide real-time personal services with seamless control interaction. Fig. 3 shows the function diagram of the proposed Magic Mirror system. First, as depicted in Fig. 4(a), the Magic Mirror switches on automatically and displays the 3D genius if voice energy is detected and the keyword “Magic Mirror” is identified. If these inputs are not detected, it functions only as a regular mirror, as shown in Fig. 4(b). With reference to the story of Snow White, to the question “Who is the most beautiful woman in the world?” the 3D genius responds with “Certainly, it is you,” as illustrated in Fig. 5(a) and (b).

Fig. 2. Images captured by the camera installed behind the reflective glass.

Fig. 3. Function diagram of the proposed system (3D virtual genius and 3D module; general mirror and Magic Mirror modes; services including weather, news, time, memo, database, Internet, music, movies, and face modification; text-to-speech and speech recognition; mirror user and mobile device interaction).

When the Magic Mirror is initiated for a multimedia application, the 3D genius provides general information (time/date/calendar) based on internally stored data and special information (such as weather/stocks/news) from the Internet. The 3D genius can display movies and music upon user request, discriminate between male and female voices, and provide usable functions in the form of icons. Alongside these functions is a lip icon labeled “3D genius.” A demo video explaining these functions can be downloaded from the Internet [9] and is illustrated in Figs. 6(a)−(f). Finally, in noisy environments, a smart phone can be used to issue further commands to and query the Magic Mirror. Table 1 presents the actual application keywords.

3. Software Technology

Image/speech processing technologies have some limitations in every environment in which they are applied, and these limitations need to be addressed in detail. Image and speech processing in smart homes incurs a low cost; further, it is extremely suitable for personal use and can be extended to recognition and HMI. This section introduces the main technologies, including image/speech processing, Internet access, and 3D graphics and vision response. We describe only the main operational steps of each technology due to space constraints; for further details on the technologies employed, please consult the references.


Table 1. Actual application keywords of the proposed Magic Mirror.

Index | User command keyword | Magic Mirror response speech | Action
A | "神奇的魔鏡" or "魔鏡~魔鏡" (Magic Mirror) | "歡迎我是魔鏡!" (Welcome, I am the Magic Mirror!) | Open LCD
B | "我想照鏡子!" (I would like to use the mirror!) | 無聲 (no spoken response) | Close LCD
C | "現在幾點了?" (What time is it now?) | "現在時間為…" (It is now…) | Database in PC
D | "天氣怎麼樣, 外面天氣怎麼樣?" (How is the weather?) | "今天台南天氣為…" (Today, Tainan is…) | Internet capture
D-1 | "那台北呢?" (How about Taipei?) | "今天台北天氣為…" (Today, Taipei is…) | Internet capture
D-2 | "那台中呢?" (How about Taichung?) | "今天台中天氣為…" (Today, Taichung is…) | Internet capture
D-3 | "那高雄呢?" (How about Kaohsiung?) | "今天高雄天氣為…" (Today, Kaohsiung is…) | Internet capture
E | "有什麼新聞?" (What is the news?) | "今日頭條新聞有…" (Today's headlines are…) | Internet capture
E-1 | "還有呢?" (What else?) | "體育新聞有..." (The sports news is…) | Internet capture
F | "世界上誰最漂亮?" (Who is the fairest of them all?) | "當然是妳啦,親愛的主人" (It is you, my dear master!) | Magic Mirror
G | "今天有什麼行程?" (What is my schedule today?) | "您今天的行程是…" (Your schedule today is…) | Database in PC
H | "你會做什麼?" (What can you do for me?) | "我可以告訴你新聞、氣象、行事曆、時間、我很厲害喔" (I can tell you the news, weather, schedule, and time! How is that?) | Magic Mirror
I | "我想看電影" (I would like to watch a movie!) | 無聲 (no spoken response) | Open movie
J | "來點音樂吧!" (How about some music!) | 無聲 (no spoken response) | Open music
J-1 | "不要放了" (Stop playing!) | 無聲 (no spoken response) | Close movie or music
K | "媽媽在家嗎" (Is mom home?) | "是的, 正在煮飯" (Yes, she is cooking.) | Magic Mirror
L | "今天股市如何" (How is the stock market today?) | "您投資的股票為…" (Your invested stock price is…) | Internet capture
M | "看著我" (Look at me!) | 無聲 (no spoken response) | Open webcam
N | "不要看我" (Don't peer!) | 無聲 (no spoken response) | Close webcam
O | "聽我的口令" (Listen to my order!) | 無聲 (no spoken response) | Close all functions
P | "不要偷聽" (Don't listen!) | 無聲 (no spoken response) | Close microphone except for keyword A
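A keyword table of this kind is naturally realized as a dispatch layer between the recognizer output and the services. The sketch below is a minimal illustration, not the authors' implementation; the English keyword strings, response texts, and action names are hypothetical placeholders standing in for the Mandarin commands above.

```python
# Hypothetical keyword-to-(response, action) mapping, mirroring the shape of Table 1.
COMMANDS = {
    "magic mirror": ("Welcome, I am the Magic Mirror!", "open_lcd"),
    "i would like to use the mirror": (None, "close_lcd"),     # None = no spoken response
    "what time is it now": ("It is now...", "query_database"),
    "how is the weather": ("Today, Tainan is...", "internet_capture"),
}

def dispatch(utterance):
    """Match a recognized utterance against the keyword table and return
    (response speech or None, action name)."""
    text = utterance.lower().strip("?!. ")
    for keyword, (speech, action) in COMMANDS.items():
        if keyword in text:
            return speech, action
    return None, "ignore"                                      # unrecognized commands are ignored
```

In use, `dispatch("Magic Mirror!")` yields the welcome speech together with the `open_lcd` action, while utterances matching no keyword fall through to `ignore`, matching the paper's behavior of accepting only valid commands.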

Fig. 4. Recognition of a speech command by the Magic Mirror: (a) Magic Mirror; (b) regular mirror.

Fig. 5. Interaction between the user and the Magic Mirror: (a) listening to the user; (b) 3D genius response.


3-A. Speech Recognition and Response

Speech signal processing technology is adopted to control the digital multimedia of the smart home, as in the Snow White example. The focus is on an automatic speech recognition control that can perform both speech recognition and synthesis [10]−[13]. The automatic speech recognition control is based on the hidden Markov model (HMM), a popular approach that achieves high recognition rates and supports speaker-independent modeling. Figs. 7 and 8 illustrate speech recognition and synthesis, respectively. The steps in these processes are as follows:

Step 1. Detection of effective voice range: Voice energy is easily detected by using buffer exchange and detection of the voice range, so that the Magic Mirror only accepts valid commands. The zero crossing rate (ZCR) and the speech energy can be estimated as follows:

\[ \mathrm{ZCR} = \frac{1}{2} \sum_{n=1}^{N} \left| \mathrm{sign}(x_n) - \mathrm{sign}(x_{n-1}) \right| \tag{1} \]

where sign(.) is 1 for positive arguments and 0 for negative arguments, and xn denotes the time domain signal for each collected speech buffer. The speech energy is determined by the average root mean square (RMS) energy within the collected speech buffer.
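Step 1 can be sketched in a few lines of NumPy: the ZCR of Eq. (1) counts sign changes across the buffer, and the RMS energy gates out silence. The decision thresholds below are hypothetical; the paper does not specify its buffer size or threshold values.

```python
import numpy as np

def zero_crossing_rate(x):
    """ZCR per Eq. (1): half the number of sign changes across the buffer,
    with sign(.) = 1 for non-negative samples and 0 otherwise."""
    s = np.where(x >= 0, 1, 0)
    return 0.5 * np.sum(np.abs(np.diff(s)))

def rms_energy(x):
    """Average root-mean-square energy of the collected speech buffer."""
    return float(np.sqrt(np.mean(np.square(x))))

def is_voice(buf, zcr_max=0.3, energy_min=0.02):
    """Hypothetical decision rule: voiced speech shows sufficient energy
    and a moderate (normalized) zero crossing rate."""
    return rms_energy(buf) > energy_min and zero_crossing_rate(buf) / len(buf) < zcr_max
```

For example, a 200 Hz sinusoid sampled at 8 kHz passes the gate, while an all-zero buffer is rejected as silence.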

Step 2. Feature extraction: The extracted features serve as the input for recognition and for classifying and training the voice data. The mel frequency cepstral coefficients (MFCC) are extracted as the speech signal analysis features [11].
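A minimal front-end sketch for Step 2, covering the standard stages that precede MFCC computation: pre-emphasis, framing, and Hamming windowing. The 25 ms frame / 10 ms hop at 16 kHz and the pre-emphasis coefficient are common defaults, not values stated in the paper.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160, alpha=0.97):
    """Pre-emphasize, slice into overlapping frames, and apply a Hamming
    window; frame_len/hop correspond to 25 ms / 10 ms at 16 kHz (assumed)."""
    x = np.append(x[0], x[1:] - alpha * x[:-1])      # pre-emphasis high-pass filter
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)            # windowed frames, shape (n_frames, frame_len)
```

Each windowed frame would then be passed through an FFT, mel filterbank, log, and DCT to obtain the MFCC vector used for recognition.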

Step 3. Sex identification: Voice information from users is modeled with a Gaussian mixture model (GMM).

Fig. 6. Implemented functions in the proposed Magic Mirror: (a) boy icon for male voice; (b) multiple functions displayed; (c) news services; (d) movie: Shrek 1; (e) music services; (f) facial deformation (diminished lips and enlarged eyes).

A Gaussian classifier is a Bayesian classifier in which the class-conditional probability density for each class λ_i has a Gaussian distribution [14]:

\[ p(x \mid \lambda_i) = \frac{1}{(2\pi)^{d/2} \, |\Sigma_i|^{1/2}} \exp\!\left[ -\frac{1}{2} (x - \mu_i)^{t} \, \Sigma_i^{-1} \, (x - \mu_i) \right] \tag{2} \]

Here, µ_i represents the mean vector; Σ_i, the covariance matrix; and d, the dimension of the feature vector. The different services are then modified based on the sex of the user.
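The density in Eq. (2) is best evaluated in the log domain for numerical stability; the sketch below classifies a feature vector by maximum class-conditional likelihood. The single Gaussian per class and the pitch-like one-dimensional feature in the test are illustrative assumptions, not the paper's trained GMMs.

```python
import numpy as np

def gaussian_loglik(x, mu, cov):
    """Log of Eq. (2): log p(x | lambda_i) for a d-dimensional Gaussian."""
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)               # stable log-determinant
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ inv @ diff)

def classify_sex(x, models):
    """Pick the class (e.g., 'male'/'female') with maximal likelihood.
    models maps class name -> (mean vector, covariance matrix)."""
    return max(models, key=lambda c: gaussian_loglik(x, *models[c]))
```

With hypothetical pitch-like means (males lower, females higher), the classifier assigns a 130 Hz feature to the male model and a 200 Hz feature to the female model.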

Step 4. Linguistic decoding: This refers to the speech recognition algorithm that transforms speech into text. By building the HMM voice model and applying the Viterbi algorithm (VA), the decoder searches for the optimal state sequence and yields good recognition results.
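The Viterbi search of Step 4 can be sketched for a discrete-observation HMM as follows; a real recognizer decodes over continuous MFCC observations with a tree-structured lexicon, so this is only a toy illustration of the algorithm itself.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely HMM state sequence for an observation sequence,
    computed in the log domain for numerical stability.
    pi: initial probs (N,), A: transitions (N, N), B: emissions (N, M)."""
    T, N = len(obs), len(pi)
    logd = np.full((T, N), -np.inf)                   # best log-score ending in each state
    back = np.zeros((T, N), dtype=int)                # backpointers
    logd[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = logd[t - 1][:, None] + np.log(A)     # scores[i, j]: from state i to j
        back[t] = np.argmax(scores, axis=0)
        logd[t] = scores[back[t], np.arange(N)] + np.log(B[:, obs[t]])
    path = [int(np.argmax(logd[-1]))]                 # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

The dynamic program keeps only the best-scoring predecessor per state, which is exactly why decoding stays linear in the sequence length.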

Step 5. Speech synthesis: This is performed by using text-to-speech (TTS) technology, which converts text data into speech output. When applied to automatic voice synthesis, TTS synthesizes varied speech to respond to user requests while providing a service, thus increasing the degree of communication between the user and the machine.

3-B. Vision Response

Image processing and 3D animation synthesis were applied to construct an animated human form (the 3D genius) [15]. The correct speaking action [16] was constructed by mimicking the human lip form. In addition, the 3D genius was required to deliver both short and long speeches in 3D animation. The best 3D animation can be obtained using the following synthesis algorithm.

Step 1. Lip feature abstraction: First, 62 feature parameters of human lip forms must be obtained from video sequences. For this, videos of the human lip form were recorded using a previously defined 40-acoustic model. The various features are then detected automatically with optical flow (2-D Lucas-Kanade flow), which is defined as

\[ \min E = \sum_{(x, y) \in R} \left[ I_{t-1}\big(x - u(x, y),\, y - v(x, y)\big) - I_t(x, y) \right]^2 \tag{3} \]

where I_t and I_{t−1} represent the gray-level images in frames t and t − 1, respectively.

Fig. 7. Structure of the automatic speech recognition control (feature extraction, pronunciation grammar, and tree-search decoding with a speaker-independent model, producing the recognized results).


Fig. 8. Automatic pronunciation and 3D animation synthesis.

Here, u(x, y) and v(x, y) represent the horizontal (x) and vertical (y) displacement vectors, respectively. Fig. 9 shows the feature detection results for the human lip form.
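The minimization in (3) can be illustrated with a single-window Lucas-Kanade solve: linearizing the residual yields the classic 2x2 normal equations in the spatial and temporal gradients. This toy version estimates one global (u, v) for the whole region R, whereas the paper tracks 62 lip features individually.

```python
import numpy as np

def lucas_kanade(frame0, frame1):
    """Single-window Lucas-Kanade: least-squares (u, v), in pixels,
    minimizing the brightness-constancy residual of Eq. (3) over the
    whole image (region R = entire frame)."""
    Iy, Ix = np.gradient(frame0)                     # spatial gradients (rows ~ y, cols ~ x)
    It = frame1 - frame0                             # temporal gradient
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    u, v = np.linalg.solve(A, b)                     # normal equations of the linearized residual
    return u, v
```

On a smooth test pattern translated purely in x, the recovered u matches the sub-pixel shift and v stays near zero.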

Step 2. 3D coordinate transform: After extracting the displacement of the features in 3D coordinates, 62 feature positions are defined in the 3D model along with the interface, as depicted in Fig. 10. The remaining vertices are controlled by nearby features; their displacements are defined as the weighted sum of the nearby feature displacements.

Step 3. Speech synthesis: The first step in speech synthesis is pre-processing the input text. The corresponding voice is processed by analyzing special symbols and sentence breaks. The content is then analyzed to segment the sentences and convert the text codes into voice codes. The speech in the database is then retrieved in order to obtain the correct parameters such as pitch, duration, energy, and pause. The resulting voice can therefore be rendered through a personal computer's sound card and speakers.

3-C. Image Processing

Face detection based on skin color alone is vulnerable to the effects of location and lighting, which lowers the detection rate. Many investigations use grayscale image information as the input to neural networks, pattern matching, or principal component analysis (PCA) [17], [18]; this information yields more accurate results than skin color detection. However, human faces can be identified much more quickly by searching for skin colors in the image frames. To adapt face detection to the smart home, facial geometry and skin color methods are therefore combined to detect human faces and facial features. Facial features can be quickly identified by detecting human faces using face geometry, and deviations in luminance, face shape, and background complexity do not affect performance.

Fig. 9. Feature detection result for the lip form.

Fig. 10. Feature controls defined by the 3D lip form.

This study adopts two 2D bitmaps as the main operators. The first bitmap s(i, j) represents the skin color defined by the ellipse skin model [19]:

\[ s(i, j) = \begin{cases} 1, & (Cb(i, j),\, Cr(i, j)) \in \text{skin color} \\ 0, & \text{otherwise} \end{cases} \tag{4} \]

The second bitmap l(i, j) denotes the light effect of the skin color:

\[ l(i, j) = \begin{cases} 1, & Y(i, j) < threshold \\ 0, & \text{otherwise} \end{cases} \tag{5} \]

and

\[ threshold = \frac{1}{\beta} \cdot \frac{\sum Y(i, j)\, s(i, j)}{\sum s(i, j)} \tag{6} \]

where Y, Cb, and Cr represent the color transform coefficients from RGB images, and β = 2 according to statistical analysis. In (5), l(i, j) = 1 signifies that the gray image Y(i, j) is below the scaled average luminance of the skin color group specified in (6). Consequently, some of the darker pixels of the eyes and lips are added to the skin color range, as illustrated in Fig. 11. Finally, five blocks are used to search for a human face: eye-left, eye-right, cheek-left, cheek-right, and lips. As shown in Fig. 12, if each block satisfies conditions (7a)−(7d), a human face is identified.
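Eqs. (4)-(6) can be sketched with NumPy bitmaps over the Cb/Cr and Y planes. The ellipse center and axes below are hypothetical placeholders; [19] provides the actual fitted skin model.

```python
import numpy as np

# Hypothetical ellipse-model parameters; the fitted values come from [19].
CB0, CR0, A_AX, B_AX = 110.0, 155.0, 20.0, 15.0

def skin_bitmap(cb, cr):
    """Eq. (4): s(i, j) = 1 when (Cb, Cr) falls inside the skin ellipse."""
    inside = ((cb - CB0) / A_AX) ** 2 + ((cr - CR0) / B_AX) ** 2 <= 1.0
    return inside.astype(np.uint8)

def light_bitmap(y, s, beta=2.0):
    """Eqs. (5)-(6): l(i, j) = 1 for pixels darker than 1/beta times the
    average luminance of the skin-color group."""
    threshold = (y * s).sum() / (beta * max(int(s.sum()), 1))
    return (y < threshold).astype(np.uint8)
```

Here `l` flags the darker eye and lip pixels so they can be folded into the skin-color range, as the text describes.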

\[ \sum_{A} s(i, j)\Big|_{\text{cheek-left}} > \alpha A, \qquad \sum_{A} s(i, j)\Big|_{\text{cheek-right}} > \alpha A \tag{7a} \]

\[ \sum_{A} l(i, j)\Big|_{\text{cheek-left}} > \gamma A, \qquad \sum_{A} l(i, j)\Big|_{\text{cheek-right}} > \gamma A \tag{7b} \]

\[ \sum_{B} s(i, j)\Big|_{\text{eye-left}} > 0, \qquad \sum_{B} s(i, j)\Big|_{\text{eye-right}} > 0 \tag{7c} \]

\[ \sum_{C} s(i, j)\Big|_{\text{lip}} > 0 \tag{7d} \]

In the above conditions, α = 0.95 owing to the large skin area of the face, and γ = (1 − α) owing to the smooth, well-lit area of the cheeks. Parameters A, B, and C are defined as the areas of the cheek, eye, and lip blocks, respectively. Fig. 13 illustrates the simulation results.

Fig. 11. Skin color detection under diverse luminance and light sources.

Fig. 12. Facial block features defined for face detection, corresponding to Eqs. (7a)−(7d).

Finally, as shown in Fig. 14, the facial features (eyes, lips, and nose) can easily be located by discriminating the blocks after the human face has been detected by the simple face detection algorithm stated above [20]. As shown in Fig. 15, by obtaining all the facial features according to the MPEG-4 Facial Animation Parameters (FAPs) [21], [22], face modification can be implemented in the mirror system. As shown in Fig. 6(f), if the polar coordinates are transformed, the eyes are enlarged and the lips diminished in size, while earrings are formed at the ear lobes.

3-D. Face Recognition

Based on the above facial features, we can implement a fast and simple face recognition method. We compare only the clearer features (lips, eyes, border of the hair, and forehead) for face recognition, as shown in Table 2. Refer to Figs. 2 and 15 for the test faces and the feature distances, respectively. The minimum model is trained using the farthest distance of a face from the mirror. In actual applications, faces are usually at a distance of 30 to 50 cm from the mirror; we analyze 10 images at different distances from the mirror to obtain the minimum model. Irrespective of where the user stands in front of the mirror, the feature distances should converge toward the minimum model. The total feature distance T for face recognition is defined as

\[ T = \left( \frac{|3.12 - 3.11| + |8.4 - 8.3|}{w} + \frac{|3.14 - 3.10| + |8.1 - 8.2| + |3.12 - 8.4| + |3.12 - 11.1| + |3.12 - 11.2| + |3.12 - 11.3|}{h} \right) \Big/ \, n \tag{8} \]

where w and h are the feature width and height normalization factors, respectively, the numbered pairs denote distances between MPEG-4 FAP feature points (see Table 2), and n is a weight that makes measurements at different distances from the mirror converge toward the trained minimum model. The user is identified based on the difference in the minimum distance (DMD) as follows:

\[ \text{user} = \arg\min_{k} \left| T - t_k \right| \tag{9} \]
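Eqs. (8) and (9) amount to a normalized sum of feature distances followed by a nearest-model lookup. A sketch, with hypothetical trained minimum-model values:

```python
def total_feature_distance(widths, heights, w, h, n):
    """Eq. (8): width-type feature distances are normalized by w,
    height-type distances by h, and the sum is scaled by the weight n."""
    return (sum(widths) / w + sum(heights) / h) / n

def identify_user(T, models):
    """Eq. (9): the family member whose trained minimum model t_k lies
    closest to the measured total feature distance T is selected."""
    return min(models, key=lambda k: abs(T - models[k]))
```

For instance, with two hypothetical trained models, a measurement near 10 resolves to the first user and one near 13 to the second.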

Fig. 13. Simulation results for face detection.

Fig. 14. Facial features obtained using the binary search method.

where t_k represents the minimum model obtained by training 10 images for each of the k family members. We obtain an 85% face recognition rate for a regular nuclear family; to increase this rate, additional facial feature distances can be incorporated. Here, we have adopted a very simple face recognition method based on facial features because this study focuses primarily on system design; several other technologies can be substituted for image detection, facial features, face recognition, voice command detection, etc. The other technologies used are as follows. Really Simple Syndication (RSS 2.0) is adopted to download information such as news, weather, and stock quotes in extensible markup language (XML) format [23]; Fig. 16 shows an example of weather information. The 3D genius tool can be implemented in the Open Graphics Library (OpenGL) [24], [25], and the movie and music players are operated through a program developed by us.

4. Conclusions and Future Work

This investigation integrates image/speech processing technology, Internet information, and the characteristics of reflective glass to realize the Snow White story in a smart home. A digital lifestyle application system has been implemented by integrating software and hardware devices. This system is very easy to implement on personal computers or existing devices. The Magic Mirror can provide multimedia, interactivity, and HMI by employing LCD devices to increase consumer interest; moreover, it is extremely inexpensive. Many products have mirror functions, such as Magic Mirror mobile phones and TVs; however, no product combines the mirror function interactively with a TV.


Moreover, such products are expensive and limited by size constraints. In this study, this level of interactivity is realized using only a microphone, speakers, a camera, and regular reflective glass. Multiple technologies need to be integrated into a smart home; improving personalization and optimization is our future work. Additional technologies can also be installed in the Magic Mirror.

Appendix

The Magic Mirror project was awarded the first prize in the National and Southern Taiwan Education Program on Image Display from the Ministry of Education—2006 Topic Application Contest on Image Display [26], [9]. A patent for this system has been filed with the Taiwan Patent (TWPAT) [27].

Fig. 15. Facial feature search using MPEG-4 FAPs [21], [22]: (a) simulation results for two face images; (b) MPEG-4 FAPs.

Fig. 16. Internet information captured by RSS 2.0.

Table 2. Minimum calculated distance between facial features (the minimum model for each user; feature-point pairs follow the MPEG-4 FAP numbering).

Feature distance | user 1 | user 2 | user 3
Eye width |3.12−3.11| | 2.86 | 2.98 | 3.74
Lip width |8.4−8.3| | 4.98 | 5.70 | 6.84
Eye height |3.14−3.10| | 0.97 | 0.97 | 0.67
Lip height |8.1−8.2| | 2.40 | 2.67 | 2.87
Eye-to-lip height |3.12−8.4| | 7.76 | 8.83 | 9.61
Height |3.12−11.1| | 9.27 | 8.97 | 8.27
Height |3.12−11.2| | 8.80 | 7.80 | 6.15
Eye to border of hair and forehead height |3.12−11.3| | 8.67 | 6.63 | 6.00

References

[1] H. Sukeda, Y. Horry, Y. Maruyama, and T. Hoshino, "Information-Accessing Furniture to Make Our Everyday Lives More Comfortable," IEEE Trans. on Consumer Electronics, vol. 52, no. 1, pp. 173−178, Feb. 2006.

[2] S. Helal, W. Mann, H. El-Zabadani, J. King, Y. Kaddoura, and E. Jansen, "The Gator Tech Smart House: A Programmable Pervasive Space," IEEE Computer Magazine, vol. 38, no. 3, pp. 50−60, Mar. 2005.

[3] K. Fujinami, F. Kawsar, and T. Nakajima, "AwareMirror: A Personalized Display Using a Mirror," Third International Conference on Pervasive Computing, vol. 3468, pp. 315−332, Munich, Germany, May 2005.

[4] A. Bourka, D. Polemi, and D. Koutsouris, "Interoperability Among Healthcare Organizations Acting as Certification Authorities," IEEE Trans. on Information Technology in Biomedicine, vol. 7, no. 4, pp. 364−377, Dec. 2003.

[5] E. Kafeza, D. K. W. Chiu, S. C. Cheung, and M. Kafeza, "Alerts in Mobile Healthcare Applications: Requirements and Pilot Study," IEEE Trans. on Information Technology in Biomedicine, vol. 8, no. 2, pp. 173−181, Jun. 2004.

[6] F. Axisa, P. M. Schmitt, C. Gehin, G. Delhomme, E. McAdams, and A. Dittmar, "Flexible Technologies and Smart Clothing for Citizen Medicine, Home Healthcare and Disease Prevention," IEEE Trans. on Information Technology in Biomedicine, vol. 9, no. 3, pp. 325−336, Sep. 2005.

[7] H. P. Park, S. H. Won, J. B. Lee, and S. W. Kim, "Smart Home: Digitally Engineered Domestic Life," Personal and Ubiquitous Computing, vol. 7, no. 3, pp. 189−196, Jul. 2003.

[8] http://www.taiwanglass.com/en/index.html.

[9] ftp://140.116.163.181. Username: magicmirror, Password: magicmirror.

[10] Y. J. Chen, C. H. Wu, Y. H. Chiu, and H. C. Liao, "Generation of Robust Phonetic Set and Decision Tree for Mandarin Using Chi-square Testing," Speech Communication, vol. 38, no. 3−4, pp. 349−364, Nov. 2002.

[11] C. H. Wu and J. H. Chen, "Automatic Generation of Synthesis Units and Prosodic Information for Chinese Concatenative Synthesis," Speech Communication, vol. 35, pp. 219−237, Oct. 2001.

[12] C. L. Huang and C. H. Wu, "Phone Set Generation Based on Acoustic and Contextual Analysis for Multilingual Speech Recognition," in Proc. ICASSP, Apr. 2007.

[13] L. Rabiner and B. Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993.

[14] R. Bates and M. Ostendorf, "Modeling Pronunciation Variation in Conversational Speech Using Prosody," ISCA Tutorial and Research Workshop on Pronunciation Modeling and Lexicon Adaptation for Spoken Language, Sep. 2002.

[15] F. Kawakami, S. Morishima, R. Harashima, and F. Yamada, "Construction of 3-D Emotion Space Based on Parameterized Faces," IEEE International Workshop on Robot and Human Communication, pp. 216−221, Jul. 1994.

[16] S. C. Choi, H. Harashima, and T. Takebe, "Analysis and Synthesis of Facial Expressions in Knowledge-Based Coding of Facial Image Sequences," in Proc. ICASSP-91, vol. 4, pp. 2737−2740, Apr. 1991.

[17] F. Cardinaux, C. Sanderson, and S. Bengio, "User Authentication via Adapted Statistical Models of Face Images," IEEE Trans. on Signal Processing, vol. 54, no. 1, Jan. 2005.

[18] W. Zuo, D. Zhang, and K. Wang, "Bidirectional PCA with Assembled Matrix Distance Metric for Image Recognition," IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 36, no. 4, pp. 863−872, Aug. 2006.

[19] M. H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting Faces in Images: A Survey," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34−58, Jan. 2002.

[20] A. Saxena, A. Anand, and A. Mukerjee, "Robust Facial Expression Recognition Using Spatially Localized Geometric Model," International Conference on Systemics, Cybernetics and Informatics, Hyderabad, India, vol. 1, pp. 124−129, Feb. 2004.

[21] Text for ISO/IEC FDIS Visual, ISO/IEC JTC1/SC29/WG11 N2502, Nov. 1998.

[22] I. S. Pandzic and R. Forchheimer, Eds., MPEG-4 Facial Animation. New York: Wiley, 2002.

[23] http://webdesign.about.com/cs/rss/a/aa052603a.htm.

[24] http://www.opengl.org/.

[25] Z. J. Chuang and C. H. Wu, "Text-to-Visual Speech Synthesis for General Objects Using Parameter-Based Lip Models," IEEE Pacific Rim Conference on Multimedia, vol. 2532, pp. 589−597, Dec. 2002.

[26] http://www.fpd.edu.tw/newsDetail.do?id=939.

[27] http://www.twpat.com/Webpat/Default.aspx.