Psychoanalysis, speech acts and the language of “free speech
Automatic Speech Recognition in Bengali Language & Statistical Machine Translation
Transcript of Automatic Speech Recognition in Bengali Language & Statistical Machine Translation
Bengali Automatic Speech Recognition & English-Bengali
Statistical Machine Translation
Tapabrata Mondal Computer Science & Engineering Dept
Jadavpur University Kolkata 700032
In SEECAT 2013 Copenhagen Business School
Our Problem
1> Bengali Automatic Speech Recognition System ( Context Independent & Context Dependent )
2>A English To Bengali Statistical Machine Translation System
Motivation
1> National Language of Bangladesh 2> spoken in West Bengal , Assam , Tripura,Jharkhand in India 3> second most popular language in India 4> 7th rank position in the world regarding amount of native speakers 5> originated from Sanskrit .
Motivation
1> important Language in India 2> Devnagari Language Family
Some Projects in this language 1> English To Indian Language Machine Translation Project ( ANUVADAKSH )2>Cross Lingual Information Access ( SANDHAN ) 3>Indian Language To Indian Language Machine Translation ( SAMPARK )
Related Work
Some Speech Recognition Systems
1> SHPINX by CMU2>BYBLOS by BBN 3>work done by Lincoln Lab , SRI , MIT and AT&T Bell Labs
Bengali Speech Recognition System
● Shruti -- > Bengali Speech Recognition System ● SPHINX3-based Bengali Automatic Speech
Recognition(ASR) system Shruti-II ● an E-mail application based on it. ● This ASR system converts standard Bengali
continuous speech to Bengali Unicode.● Developed by Indian Institute Of Technology ,
Kharagpur .
Automatic Speech Recognition ASR
1> Context Independent ASR Model 2> Context Dependent ASR Model
for Bengali Language
Context Independent Model ---> Mono Phone Model Context Dependent -----> Tri Phone Model
ASR
● Context Independent Model
target phone does not depend on previous or next phones --- > Mono Phone Model
● Context Dependent Model
target phone depends on previous and next phones ----> Tri Phone Model
Automatic Speech Recognition ASR
Data Base used in this ASR task :
Shruti Bangla Speech Corpus
1>22 hours speech corpus in Bengali Language 2> Developed by Society for Natural Language Technology and Research with the help of Indian
Institute Of Technology , Kharagpur
Automatic Speech Recognition ASR
Shruti Bangla Speech Corpus
1>Each audio file contains one sentence with its transcription in a separate file .
2>It has a pronunciation dictionary with a Grapheme to Phoneme mapping .
Automatic Speech Recognition ASR
Shruti Bangla Speech Corpus
1>7383 sentences
2> 49 phonemes
3>22012 words
Not Sufficient to build up a ASR Language Model for Bengali Language .
Automatic Speech Recognition ASR
ITRans 5 NotationAmi satyi satyii oi jagadIsha jeThamAlAni bhadralokera kono khabarA`` khabara` rAkhi nA
UTF-8 Bengali encoding
আিম সতিয সতিযি িও জগদীশ েজঠমালািন ভদরললোলকর েকোলনো খবরা`` খবর` রািখ না
Grapheme To Phoneme Model
Grapheme Phoneme
সরবলমোট s aa rr b oh m oh T
দুঃখ dd u k kh oh
িবষনুবাবুর b i sh n u b A b u rr
িভিসিড bh i s i D i
িভিসিড bh i s i D i
Automatic Speech Recognition
Context Independent ASR Model for Bengali Language in the Second week of SEECAT 2013 project .
mono phone model .
Moses .
Next aim ---> Context Dependent Model for Bengali ASR .
Moses
● “Moses is a statistical machine translation system
● allows automatically train translation models for any language pair.
● collection of translated texts (parallel corpus). ● An efficient search algorithm finds quickly the
highest probability translation among the exponential number of choices”
.
Automatic Speech Recognition
● A Context Dependent Model in Summer House ● Larger Language Model ● Moses ● Grapheme To Phoneme Model
Automatic Speech Recognition ASR
we have clustered Bengali phonemes into clusters like Vowels , Consonants , Guttural , Palatal , Monopthong , Dipthong , Labial , Retroflex , Voiced , UnVoiced , Aspirated , Unaspirated , Velar , Palatal , Nasal stops , Dental , Labial , Approximants , Sibilants , Glottal .
Bengali Phoneme Clustering
Example : ● sil1 sil2 sil3 sil4 // silences● a oi rr aa u // vowels● a aa // vowel guttural s_monopthong● oi // vowel palatal s_monopthong● u //vowel labial l_monopthong● r // vowel retroflex s_monopthong● k kh g gh ch chh j jh ng T Th D Dh tt tth dd ddh
n p f b bh m // consonants●
Bengali Phoneme Clustering
● k //velar unvoiced_unaspirated● kh g //velar voiced_unaspirated● gh //velar voiced_aspirated● ch //palatal unvoiced_unaspirated● chh //palatal unvoiced_aspirated● j //palatal voiced_unaspirated● jh //palatal voiced_aspirated● ^n //palatal nasal_stops●
Bengali Phoneme Clustering
● T //retroflex unvoiced_unaspirated● Th //retroflex unvoiced_aspirated● D //retroflex voiced_unaspirated● Dh //retroflex voiced_aspirated● tt //dental unvoiced_unaspirated● tth //dental unvoiced_aspirated● dd //dental voiced_unaspirated● ddh //dental voiced_aspirated
Bengali Phoneme Clustering
● n //dental nasal_stops● p //labial unvoiced_unaspirated● f //labial unvoiced_aspirated● b //labial voiced_unaspirated● bh //labial voiced_aspirated● m //labial nasal_stops● y rr l // approximants● sh sh s //sibilants● h //glottal●
Mapping Bengali Phoneme To English Phoneme
● We have mapped Bengali Phonemes to English Phonemes .
Bengali Phoneme English Phoneme● A aa● ^A aa,2 aa,3 ng,2● aa aa● ^aa aa,2 aa,3 ng,2● b b ● bh b,2 b,3 h,2 ●
Mapping Of Bengali Phoneme to English Phoneme
● Bengali Phoneme English Phoneme● ch ch● chh ch,2 ch,3 h,2 ● D d● dd d● ddh dh● Dh dh
etc ....
Language Model
● The Large Language Model from Tourism & Health Domain , Cleaned Corpus .
● 6 lakh words and 41 K sentences .● 91 K types ( Unique words ) from the Data
base . ● Grapheme To Phoneme Model ● Moses
Automatic Speech Recognition ASR
Evaluation of Context Dependent ASR Model .
1>215 out of Domain Sentences recorded in my Voice
2>57 Sentences of Domain Data in my voice
Hypothesis files produced & evaluated ..
English To Bengali SMT
Aim ----> English To Bengali Statistical Machine Translation Model using Moses .
English To Bengali SMT
65 K parallel sentences form Tourism & Health Domain and from others domains .
Cleaned Corpus for SMT
Moses running using these sentences .
Future Work
1> Improvement of ASR model using a larger Language Model of sentences from various domains
2>Improvement of SMT Model for English To Bengali Languages