Automatic Speech Recognition in Bengali Language & Statistical Machine Translation

29
Bengali Automatic Speech Recognition & English-Bengali Statistical Machine Translation Tapabrata Mondal Computer Science & Engineering Dept Jadavpur University Kolkata 700032 In SEECAT 2013 Copenhagen Business School

Transcript of Automatic Speech Recognition in Bengali Language & Statistical Machine Translation

Bengali Automatic Speech Recognition & English-Bengali

Statistical Machine Translation

Tapabrata Mondal Computer Science & Engineering Dept

Jadavpur University Kolkata 700032

In SEECAT 2013 Copenhagen Business School

Our Problem

1> Bengali Automatic Speech Recognition System ( Context Independent & Context Dependent )

2>A English To Bengali Statistical Machine Translation System

Motivation

1> National Language of Bangladesh 2> spoken in West Bengal , Assam , Tripura,Jharkhand in India 3> second most popular language in India 4> 7th rank position in the world regarding amount of native speakers 5> originated from Sanskrit .

Motivation

1> important Language in India 2> Devnagari Language Family

Some Projects in this language 1> English To Indian Language Machine Translation Project ( ANUVADAKSH )2>Cross Lingual Information Access ( SANDHAN ) 3>Indian Language To Indian Language Machine Translation ( SAMPARK )

Related Work

Some Speech Recognition Systems

1> SHPINX by CMU2>BYBLOS by BBN 3>work done by Lincoln Lab , SRI , MIT and AT&T Bell Labs

Bengali Speech Recognition System

● Shruti -- > Bengali Speech Recognition System ● SPHINX3-based Bengali Automatic Speech

Recognition(ASR) system Shruti-II ● an E-mail application based on it. ● This ASR system converts standard Bengali

continuous speech to Bengali Unicode.● Developed by Indian Institute Of Technology ,

Kharagpur .

Automatic Speech Recognition ASR

1> Context Independent ASR Model 2> Context Dependent ASR Model

for Bengali Language

Context Independent Model ---> Mono Phone Model Context Dependent -----> Tri Phone Model

ASR

● Context Independent Model

target phone does not depend on previous or next phones --- > Mono Phone Model

● Context Dependent Model

target phone depends on previous and next phones ----> Tri Phone Model

Automatic Speech Recognition ASR

Data Base used in this ASR task :

Shruti Bangla Speech Corpus

1>22 hours speech corpus in Bengali Language 2> Developed by Society for Natural Language Technology and Research with the help of Indian

Institute Of Technology , Kharagpur

Automatic Speech Recognition ASR

Shruti Bangla Speech Corpus

1>Each audio file contains one sentence with its transcription in a separate file .

2>It has a pronunciation dictionary with a Grapheme to Phoneme mapping .

Automatic Speech Recognition ASR

Shruti Bangla Speech Corpus

1>7383 sentences

2> 49 phonemes

3>22012 words

Not Sufficient to build up a ASR Language Model for Bengali Language .

Automatic Speech Recognition ASR

ITRans 5 NotationAmi satyi satyii oi jagadIsha jeThamAlAni bhadralokera kono khabarA`` khabara` rAkhi nA

UTF-8 Bengali encoding

আিম সতিয সতিযি িও জগদীশ েজঠমালািন ভদরললোলকর েকোলনো খবরা`` খবর` রািখ না

Grapheme To Phoneme Model

Grapheme Phoneme

সরবলমোট s aa rr b oh m oh T

দুঃখ dd u k kh oh

িবষনুবাবুর b i sh n u b A b u rr

িভিসিড bh i s i D i

িভিসিড bh i s i D i

Automatic Speech Recognition

Context Independent ASR Model for Bengali Language in the Second week of SEECAT 2013 project .

mono phone model .

Moses .

Next aim ---> Context Dependent Model for Bengali ASR .

Moses

● “Moses is a statistical machine translation system

● allows automatically train translation models for any language pair.

● collection of translated texts (parallel corpus). ● An efficient search algorithm finds quickly the

highest probability translation among the exponential number of choices”

.

Automatic Speech Recognition

● A Context Dependent Model in Summer House ● Larger Language Model ● Moses ● Grapheme To Phoneme Model

Automatic Speech Recognition ASR

we have clustered Bengali phonemes into clusters like Vowels , Consonants , Guttural , Palatal , Monopthong , Dipthong , Labial , Retroflex , Voiced , UnVoiced , Aspirated , Unaspirated , Velar , Palatal , Nasal stops , Dental , Labial , Approximants , Sibilants , Glottal .

Bengali Phoneme Clustering

Example : ● sil1 sil2 sil3 sil4 // silences● a oi rr aa u // vowels● a aa // vowel guttural s_monopthong● oi // vowel palatal s_monopthong● u //vowel labial l_monopthong● r // vowel retroflex s_monopthong● k kh g gh ch chh j jh ng T Th D Dh tt tth dd ddh

n p f b bh m // consonants●

Bengali Phoneme Clustering

● k //velar unvoiced_unaspirated● kh g //velar voiced_unaspirated● gh //velar voiced_aspirated● ch //palatal unvoiced_unaspirated● chh //palatal unvoiced_aspirated● j //palatal voiced_unaspirated● jh //palatal voiced_aspirated● ^n //palatal nasal_stops●

Bengali Phoneme Clustering

● T //retroflex unvoiced_unaspirated● Th //retroflex unvoiced_aspirated● D //retroflex voiced_unaspirated● Dh //retroflex voiced_aspirated● tt //dental unvoiced_unaspirated● tth //dental unvoiced_aspirated● dd //dental voiced_unaspirated● ddh //dental voiced_aspirated

Bengali Phoneme Clustering

● n //dental nasal_stops● p //labial unvoiced_unaspirated● f //labial unvoiced_aspirated● b //labial voiced_unaspirated● bh //labial voiced_aspirated● m //labial nasal_stops● y rr l // approximants● sh sh s //sibilants● h //glottal●

Mapping Bengali Phoneme To English Phoneme

● We have mapped Bengali Phonemes to English Phonemes .

Bengali Phoneme English Phoneme● A aa● ^A aa,2 aa,3 ng,2● aa aa● ^aa aa,2 aa,3 ng,2● b b ● bh b,2 b,3 h,2 ●

Mapping Of Bengali Phoneme to English Phoneme

● Bengali Phoneme English Phoneme● ch ch● chh ch,2 ch,3 h,2 ● D d● dd d● ddh dh● Dh dh

etc ....

Language Model

● The Large Language Model from Tourism & Health Domain , Cleaned Corpus .

● 6 lakh words and 41 K sentences .● 91 K types ( Unique words ) from the Data

base . ● Grapheme To Phoneme Model ● Moses

Automatic Speech Recognition ASR

Evaluation of Context Dependent ASR Model .

1>215 out of Domain Sentences recorded in my Voice

2>57 Sentences of Domain Data in my voice

Hypothesis files produced & evaluated ..

English To Bengali SMT

Aim ----> English To Bengali Statistical Machine Translation Model using Moses .

English To Bengali SMT

65 K parallel sentences form Tourism & Health Domain and from others domains .

Cleaned Corpus for SMT

Moses running using these sentences .

Future Work

1> Improvement of ASR model using a larger Language Model of sentences from various domains

2>Improvement of SMT Model for English To Bengali Languages

Thanks

Questions !!