Online Urdu Handwritten
Character Recognition System
Quara-tul-Ain Safdar
2019
Department of Electrical Engineering
Pakistan Institute of Engineering and Applied Sciences
Nilore, Islamabad 45650, Pakistan
Reviewers and Examiners
Name, Designation & Address
Foreign Reviewer 1
Dr. Jian Yang, Professor, Dept. of Electronic Engineering, Tsinghua University, Beijing 100084, China
Foreign Reviewer 2
Dr. Choon Ki Ahn, Professor, Room 506, Engineering Building, School of Electrical Engineering, Korea University, Seoul, Korea
Foreign Reviewer 3
Dr. Liangrui Peng, Associate Professor, Dept. of Electronic Engineering, Tsinghua University, Beijing 100084, China
Internal Examiner 1
Dr. Abdul Jalil, Professor, Dept. of Electrical Engineering, International Islamic University, Islamabad, Pakistan
Internal Examiner 2
Dr. Mutawarra Hussain, Professor, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Islamabad, Pakistan
Internal Examiner 3
Dr. Ijaz Mansoor Quereshi, Professor, Dept. of Electrical Engineering, Air University, Sector E-9, Islamabad
Head of the Department (Name):
Signature with Date:
Certificate of Approval
This is to certify that the research work presented in this thesis, titled Online Urdu
Handwritten Character Recognition System, was conducted by Ms. Quara-tul-Ain
Safdar under the supervision of Dr. Kamran Ullah Khan.
No part of this thesis has been submitted anywhere else for any other degree. This
thesis is submitted in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in the field of Electrical Engineering.
Student Name: Quara-tul-Ain Safdar Signature:
Examination Committee:
Examiners Name, Designation & Address Signature
Internal Examiner 1: Dr. Abdul Jalil, Professor, DEE, IIU, Islamabad
Internal Examiner 2: Dr. Mutawarra Hussain, Professor, DCIS, PIEAS, Islamabad
Internal Examiner 3: Dr. Ijaz Mansoor Quereshi, Professor, DEE, Air University, Islamabad
Supervisor: Dr. Kamran Ullah Khan, PE, DEE, PIEAS, Islamabad
Department Head: Dr. Muhammad Arif, DCE, DEE, PIEAS, Islamabad
Dean Research PIEAS: Dr. Naeem Iqbal, DCE, DEE, PIEAS, Islamabad
Thesis Submission Approval
This is to certify that the work contained in this thesis entitled Online Urdu
Handwritten Character Recognition System was carried out by Quara-tul-
Ain Safdar under my supervision and that in my opinion, it is fully adequate,
in scope and quality, for the degree of PhD Electrical Engineering from Pakistan
Institute of Engineering and Applied Sciences (PIEAS).
Supervisor:
Name: Dr. Kamran Ullah Khan
Date: February 14, 2019
Place: PIEAS, Islamabad
Head, Department of Electrical Engineering:
Name: Dr. Muhammad Arif
Date: February 14, 2019
Place: PIEAS, Islamabad
Online Urdu Handwritten
Character Recognition System
Quara-tul-Ain Safdar
Submitted in partial fulfillment of the requirements
for the degree of Ph.D.
2019
Department of Electrical Engineering
Pakistan Institute of Engineering and Applied Sciences
Nilore, Islamabad 45650, Pakistan
Acknowledgements
At last, I have traveled the (pro)long(ed) thoroughfare of writing a PhD thesis. It
seems like walking on a never-ending road. It feels like wandering from room to
room hunting for a diamond necklace that is already around your neck, though you
are unaware of its presence. However, the whole endeavor led to a beautiful
destination.
It started at PIEAS, in Nilore. Well, right before the beginning began,
the Higher Education Commission of Pakistan advertised the Indigenous Scholarship;
I applied and got selected, and my parents' encouragement brought me to PIEAS.
Let me take you there for a stroll.
Clear blue sky, picturesque hills, wild greenery, twittering birds, and tranquillity... it is PIEAS! (There are jackals, oxen and pigs too, but do not look at them.) 'Pleasant', 'Cooperative', 'Nice', 'very Nice'... three persons, four words! They made a long-lasting impression.
In the beginning, I was afraid of the 'giants of knowledge' in the Department of Electrical Engineering, and my PhD supervisor is one of them. Fortunately, they were kind enough to teach me the skills they had learnt throughout their lives. And they were sensible enough to make me realize the difficulties that come along the journey. You know, the most benevolent Allah favored me with a high-quality man as my PhD supervisor. With very clear concepts and deep knowledge, my PhD supervisor polished my learning skills and illuminated my research avenue with his intellectual proficiency. He never refused to answer my questions. As a human being, respecting my space, he guided me in deciding between 'appropriate and inappropriate', and 'right and wrong'. Like my parents, he always let me decide independently. It is he who taught me how to acknowledge the things worth acknowledging.
Along the way, there were many faces; some turned into well-wishers, some
into friends, and a few into family. There were (uncountable) helping hands as
well, and shoulders on which I could rest. I can never forget the 'Golden Girls' of
session 2009-2011 who filled the blanks with valuable moments. I will remember
the ‘Caring Agglomerates’ of session 2012-2014 for the respect I was endowed with.
And then there were the special ones! The days we spent walking, talking, laughing and laughing, and once again laughing until our jaws hurt. The messages of "bhookun laggiun veryun shadeedun" (I am very, very hungry) for lunches and dinners. The prathas we literally made together, and that birthday cake too. The arguments, counter-arguments, counter-counter-arguments; we never seconded each other's opinion, yet we still sang (screamed is a truer word for our singing) the songs together. This is all the love I am holding on to forever.
The road was getting longer than usual and time was getting harder on me because I had gotten stuck somewhere on the track of publishing my research work. Mornings met up with the evenings, the evenings transformed into nights, and the given time was running out. But courage did not desert me, because of the prayers. The prayers and true support of my family, friends and well-wishers, my diamonds, never left me in the darkness of disappointment. Rather, my home trips always made me fresh and more energetic to carry the journey forward. I owe them all.
From the starting block to the finish line, many ups and downs have been passed. At this moment, I am thankful for the nights that turned into mornings; I am thankful for the friends who turned into family; I am thankful for the family who turned into absolute prayers and a never-ending source of courage and determination; and I am thankful for the dreams that turned into reality.
The people who deserve to be thanked the most are the taxpayers of my country, because it is they who paved the way for an ordinary girl to take the course of her dreams. It is they who helped me find the diamonds of my necklace. Thanks, sir/madam; all the rest is mute.
I am thankful to the Higher Education Commission of Pakistan (HEC) for
providing scholarship under the Indigenous PhD 5000 Fellowship Program (Phase-
V) for my PhD studies.
Author’s Declaration
I, Quara-tul-Ain Safdar, hereby declare that my PhD thesis titled Online
Urdu Handwritten Character Recognition System is my own work and
has not been submitted previously by me or anybody else for taking any degree
from Pakistan Institute of Engineering and Applied Sciences (PIEAS) or any other
university / institute in the country / world. At any time, if my statement is found
to be incorrect (even after my graduation), the university has the right to withdraw
my PhD degree.
(Quara-tul-Ain Safdar)
February 14, 2019
PIEAS, Islamabad.
Plagiarism Undertaking
I, Quara-tul-Ain Safdar, solemnly declare that the research work presented in the
thesis titled Online Urdu Handwritten Character Recognition System is
solely my research work with no significant contribution from any other person.
Small contributions / help, wherever taken, have been duly acknowledged or referred
to, and the complete thesis has been written by me.
I understand the zero-tolerance policy of the HEC and Pakistan Institute of
Engineering and Applied Sciences (PIEAS) towards plagiarism. Therefore, I, as the
author of the above-titled thesis, declare that no portion of my thesis has been
plagiarized and any material used as a reference is properly referred to / cited.
I undertake that if I am found guilty of any formal plagiarism in the above-titled
thesis even after the award of my PhD degree, PIEAS reserves the right to
withdraw / revoke my PhD degree, and HEC and PIEAS have the right to publish
my name on the HEC / PIEAS website where the names of students who submitted
plagiarized theses are placed.
(Quara-tul-Ain Safdar)
February 14, 2019
PIEAS, Islamabad.
Copyright Statement
The entire contents of this thesis, entitled Online Urdu Handwritten
Character Recognition System and carried out by Quara-tul-Ain Safdar, are the
intellectual property of Pakistan Institute of Engineering and Applied Sciences
(PIEAS). No portion of the thesis may be reproduced without obtaining explicit
permission from PIEAS.
Contents
Acknowledgements ii
Author’s Declaration iv
Copyright Statement vi
Contents vii
List of Figures xi
List of Tables xv
Abstract xxi
List of Publications and Patents xxii
List of Abbreviations and Symbols xxiii
1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Place of Handwriting in Digital Age . . . . . . . . . . . . . . . . . . 4
1.3 Word Processing Software . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Integrating Handwriting with Technology . . . . . . . . . . . . . . . 6
1.5 Difficulties Involved in Handwriting Recognition . . . . . . . . . . . 7
1.6 Online and Offline Handwriting Recognition . . . . . . . . . . . . . 9
1.6.1 Dynamic Information acquired through Online Hardware . . 10
1.6.2 Advantages of Online Handwriting Recognition over the Of-
fline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.6.3 Available Handwriting Recognition Software . . . . . . . . . 12
1.7 Problem Statement: Online Handwritten Urdu Character Recognition 14
1.8 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.9 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.10 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.11 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2 Urdu 22
2.1 Urdu Character-Set . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.1 Urdu Diacritics . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.2 Single and Multi-Stroke Characters in Urdu . . . . . . . . . 24
2.1.3 Word-Breakdown Structure in Urdu . . . . . . . . . . . . . . 24
2.1.4 Half-Forms of Urdu Alphabets . . . . . . . . . . . . . . . . . 25
2.2 Urdu Fonts: Where do these Half-Forms come from? . . . . . . . . 27
2.2.1 The Nastalique Font . . . . . . . . . . . . . . . . . . . . . . 28
2.2.1.1 Characteristics of Nastalique Font . . . . . . . . . . 31
2.3 Idiosyncrasies of Urdu-Writing . . . . . . . . . . . . . . . . . . . . . 33
3 System Description 38
3.1 Data Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.1.1 GUI: Writing Canvas . . . . . . . . . . . . . . . . . . . . . . 38
3.1.2 Information in Handwritten Character-Signal . . . . . . . . 39
3.1.3 About the Data . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1.4 Instructions for writing . . . . . . . . . . . . . . . . . . . . . 42
3.2 Character Database . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.1 Handwritten Samples . . . . . . . . . . . . . . . . . . . . . . 44
3.3 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.1 Re-Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.2 Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4 Pre-Classification 50
4.1 Pre-Classification of Half-Forms . . . . . . . . . . . . . . . . . . . . 50
4.2 Results of Pre-Classification . . . . . . . . . . . . . . . . . . . . . . 52
4.3 Further Reflections of Pre-Classification . . . . . . . . . . . . . . . . 56
5 Feature Extraction 58
5.1 Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.1.1 Daubechies Wavelets . . . . . . . . . . . . . . . . . . . . . . 63
5.1.2 Discrimination Power of Wavelets . . . . . . . . . . . . . . . 63
5.1.3 Biorthogonal Wavelets . . . . . . . . . . . . . . . . . . . . . 65
5.1.4 Discrete Meyer Wavelets . . . . . . . . . . . . . . . . . . . . 65
5.2 Structural Features . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.3 Sensory Input Values . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6 Final Classification 72
6.1 Final Classifiers with Pre-Classification . . . . . . . . . . . . . . . . 72
6.2 Final Classifiers without Pre-Classification . . . . . . . . . . . . . . 74
6.3 Artificial Neural Networks . . . . . . . . . . . . . . . . . . . . . . . 74
6.4 Support Vector Machines (SVMs) . . . . . . . . . . . . . . . . . . . 75
6.5 Recurrent Neural Networks: Long Short-Term Memory . . . . . . . 77
6.6 Deep Belief Network . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.7 AutoEncoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.8 Results and Discussions . . . . . . . . . . . . . . . . . . . . . . . . 82
6.9 Results with Pre-Classification . . . . . . . . . . . . . . . . . . . . . 82
6.10 Results without Pre-Classification . . . . . . . . . . . . . . . . . . . 82
6.11 Maximum Recognition Rate . . . . . . . . . . . . . . . . . . . . . . 85
6.11.1 Overall Accuracy . . . . . . . . . . . . . . . . . . . . . . . . 86
6.11.2 Polling or Subset-wise Accuracy . . . . . . . . . . . . . . . . 87
6.11.3 Character-wise Accuracy . . . . . . . . . . . . . . . . . . . . 87
6.12 Error Analysis using Confusion Matrices . . . . . . . . . . . . . . . 87
6.12.1 Confusing Characters . . . . . . . . . . . . . . . . . . . . . . 91
7 Conclusion 95
7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Appendices 98
Appendix A Confusion Matrices 99
A.1 Confusion Matrices of Support Vector Classifier with db2 -Wavelet-
Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
A.2 Confusion Matrices of Support Vector Classifier with Sensory Input
Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Appendix B Handwritten Urdu Character Samples 115
References 122
List of Figures
Figure 1.1 Ancient symbols for alphabets [7] . . . . . . . . . . . . . . . 2
Figure 2.1 Urdu alphabets (fundamental) . . . . . . . . . . . . . . . . . 23
Figure 2.2 Alphabets added to fundamental Urdu alphabets to cope
with phonetic peculiarities . . . . . . . . . . . . . . . . . . . 23
Figure 2.3 Examples of Urdu (fundamental) alphabets with major and
(none, one-, two-, or three-) minor strokes . . . . . . . . . . 25
Figure 2.4 Constructing the Urdu-words . . . . . . . . . . . . . . . . . 26
Figure 2.5 All Urdu characters in all half-forms. . . . . . . . . . . . . . 26
Figure 2.6 Single- and multi-strokes half-forms of Urdu Characters . . . 28
Figure 2.7 Examples of words composed of half-form characters. . . . . 29
Figure 2.8 Examples of words composed from (segmented) handwritten
half-form characters . . . . . . . . . . . . . . . . . . . . . . 29
Figure 2.9 Different Urdu fonts . . . . . . . . . . . . . . . . . . . . . . 30
Figure 2.10 Context dependency . . . . . . . . . . . . . . . . . . . . . . 33
Figure 2.11 Distinct features of Nastalique font . . . . . . . . . . . . . . 34
Figure 2.12 Idiosyncrasies of Urdu writing emphasizing ligature overlap,
writing directions, and placement of diacritics . . . . . . . . 35
Figure 2.13 Idiosyncrasies of Urdu writing emphasizing presence and
characteristics of loops in different writing styles . . . . . . 36
Figure 2.14 Idiosyncrasies of Urdu writing emphasizing presence or ab-
sence of loop in the same character penned by different hands 36
Figure 3.1 Block diagram of the proposed Online Urdu character recog-
nition system: from data acquisition to preprocessing to pre-
classification to feature extraction to final classification. . . . 39
Figure 3.2 Writing interface for digitizing tablet . . . . . . . . . . . . . 40
Figure 3.3 An Urdu word is written on the canvas with the help of a
stylus and tablet . . . . . . . . . . . . . . . . . . . . . . . . 41
Figure 3.4 Examples of handwritten characters using a stylus and digitizing tablet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Figure 3.5 Examples of handwritten characters using a stylus and digitizing tablet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Figure 3.6 Examples of handwritten characters using a stylus and digitizing tablet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Figure 3.7 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet . . 46
Figure 3.8 Re-sampling and Down-sampling of characters . . . . . . . . 48
Figure 3.9 Smoothing of Urdu Handwritten Samples . . . . . . . . . . . 49
Figure 4.1 Pre-classification of initial half-forms on the basis of stroke
count, position and shape of diacritics . . . . . . . . . . . . . 51
Figure 4.2 Pre-classification of medial half-forms on the basis of stroke
count, position and shape of diacritics . . . . . . . . . . . . . 52
Figure 4.3 Pre-classification of terminal half-forms on the basis of
stroke count, position and shape of diacritics . . . . . . . . . 53
Figure 5.1 Wavelet coefficients for ‘sheen’ and ‘zwad’ . . . . . . . . . . 59
Figure 5.2 Wavelet coefficients for different Urdu characters in their
half-forms. Top row shows the character, and x(t) and y(t)
of its major stroke. Second and third rows show level-2
db2 wavelet approximation, and level-4 db2 wavelet detail
coefficients of x(t) and y(t) respectively . . . . . . . . . . . . 61
Figure 5.3 Wavelet coefficients for ‘Tay’ . . . . . . . . . . . . . . . . . . 62
Figure 5.4 Wavelet coefficients for ‘kaafI’ . . . . . . . . . . . . . . . . . 64
Figure 5.5 Wavelet coefficients for ‘hamza’ . . . . . . . . . . . . . . . . 66
Figure 5.6 Wavelet coefficients for ‘ghain’ and ‘fay’ . . . . . . . . . . . . 68
Figure 5.7 Wavelet coefficients for 'daal' and 'wao'. Due to the flow of writing by the users, 'daal' includes a loop which makes its wavelet transform similar to 'wao' . . . . . . . . . . . . . 69
Figure 5.8 Wavelet coefficients for three different handwritten samples
of ‘meem’ in initial form. Top row shows the character
‘meem’ in initial form written differently by different users,
and x(t) and y(t) of its major stroke. Second and third
rows show level-2 db2 wavelet approximation, and level-4
db2 wavelet detail coefficients of x(t) and y(t) respectively . 70
Figure 5.9 Wavelet coefficients for Urdu character ‘hay ’ in their half-
forms. Top row shows character ‘hay ’, and x(t) and y(t)
of its major stroke. Second and third rows show level-2
db2 wavelet approximation, and level-4 db2 wavelet detail
coefficients of x(t) and y(t) respectively . . . . . . . . . . . . 71
Figure 6.1 Multi-layer perceptron neural network . . . . . . . . . . . . 75
Figure 6.2 Bidirectional multi-layer recurrent neural network . . . . . . 77
Figure 6.3 A simple recurrent neural network. Along solid edges acti-
vation is passed as in feed-forward network. Along dashed
edges a source node at each time t is connected to a target
node at each following time t+1 . . . . . . . . . . . . . . . . 78
Figure 6.4 Confusing pair of ‘fay ’ and ‘ghain’ in medial forms . . . . . 93
Figure 6.5 Confusing pair of ‘Tay ’ and ‘hamza’ in medial forms . . . . 94
Figure 6.6 Confusing pair of ‘ain’ and ‘swad ’ in medial forms . . . . . . 94
Figure 6.7 Confusing pair of ‘meem’ and ‘swad ’ in initial forms . . . . . 94
Figure 6.8 Confusing pair of ‘daal ’ and ‘wao’ in terminal forms . . . . . 94
Figure B.1 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Figure B.2 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Figure B.3 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Figure B.4 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Figure B.5 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Figure B.6 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Figure B.7 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Figure B.8 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Figure B.9 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Figure B.10 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Figure B.11 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Figure B.12 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Figure B.13 A handwritten ensemble of all Urdu characters written on
the canvas with the help of a stylus and digitizing tablet by
writer-13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
List of Tables
Table 1.1 Comparison of the proposed online Urdu handwritten char-
acter recognition method with Arabic work . . . . . . . . . . 18
Table 1.2 Comparison of the proposed online Urdu handwritten char-
acter recognition method with Persian work . . . . . . . . . . 18
Table 1.3 Comparison of online Urdu handwritten character recognition 19
Table 4.1 Pre-classification of Urdu character-set. The encircled num-
bers indicate the cardinality of final stage subsets that could
be obtained with the help of the proposed pre-classifier . . . . 54
Table 4.2 Characters recognized at the pre-classification stage that do not require any further classification . . . . . . . . . . . . . . 55
Table 6.1 Features-classifier Summary . . . . . . . . . . . . . . . . . . . 73
Table 6.2 ANN configurations (trained using wavelet db2 approxima-
tion and detailed coefficients). . . . . . . . . . . . . . . . . . 76
Table 6.3 RNN configurations (trained using sensory input values). . . . 80
Table 6.4 DBN configurations (trained using wavelet dmey approxima-
tion and detailed coefficients). . . . . . . . . . . . . . . . . . 81
Table 6.5 Recognition rates for each subset of handwritten half-form Urdu characters obtained from the pre-classifier. Results obtained with ANNs and SVMs using different features are presented for comparison. . . . . . . . . . . . . . . . . . . . 83
Table 6.6 Recognition rates for each subset of handwritten Urdu char-
acters obtained from the pre-classifier. Results obtained with
DBN, AE-DBN, AE-SVM and RNN using different features
are presented for comparison. . . . . . . . . . . . . . . . . . . 84
Table 6.7 Recognition rates for half-form Urdu characters without go-
ing through pre-classification. Results are obtained using
SVMs, DBN, AE-DBN, AE-SVM and RNN using different
features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Table 6.8 Characters accuracy chart . . . . . . . . . . . . . . . . . . . . 88
Table 6.9 Confusion matrix for 4-stroke characters (initial half-form)
with dot diacritic above the major stroke . . . . . . . . . . . 89
Table 6.10 Confusion matrix for initial half-forms 2-stroke characters
with other-than-dot diacritic above the major stroke. Overall
accuracy for this subset is 91.9% . . . . . . . . . . . . . . . . 90
Table 6.11 Confusion matrix for medial half-form 2-stroke characters
with dot diacritic above the major stroke. Overall accuracy
for this subset is 93.6% . . . . . . . . . . . . . . . . . . . . . 90
Table 6.15 Confusion matrix for medial half-forms 2-stroke characters
with other-than-dot diacritic above the major stroke. Overall
accuracy for this subset is 93.3% . . . . . . . . . . . . . . . . 92
Table 6.16 Confusion matrix for terminal half-forms 2-stroke characters
with dot diacritic above the major stroke. Overall accuracy
for this subset is 96.7% . . . . . . . . . . . . . . . . . . . . . 92
Table 6.17 Confusion matrix for terminal half-forms 4-stroke characters
with dot diacritic above the major stroke. Overall accuracy
for this subset is 99.6% . . . . . . . . . . . . . . . . . . . . . 93
Table A.1 Confusion matrix for single-stroke characters (initial half-
form). It contains 7 characters. Overall accuracy: 94.7% . . . 99
Table A.2 Confusion matrix for 2-stroke characters (initial half-form)
with dot diacritic above the major stroke. It contains 6 char-
acters. Overall accuracy: 99.1% . . . . . . . . . . . . . . . . . 100
Table A.3 Confusion matrix for 2-stroke characters (initial half-form)
with other-than-dot diacritic above the major stroke. It con-
tains 6 characters. Overall accuracy: 91.9% . . . . . . . . . . 100
Table A.4 Confusion matrix for 2-stroke characters (initial half-form)
with dot diacritic below the major stroke. It contains 3 char-
acters. Overall accuracy: 97.2% . . . . . . . . . . . . . . . . . 100
Table A.5 Confusion matrix for 2-stroke characters (initial half-form)
with other-than-dot diacritic below the major stroke. It con-
tains 2 characters. Overall accuracy: 98.3% . . . . . . . . . . 101
Table A.6 Confusion matrix for 3-stroke characters (initial half-form)
with dot diacritic above the major stroke. It contains 3 char-
acters. Overall accuracy: 94.4% . . . . . . . . . . . . . . . . . 101
Table A.7 Confusion matrix for 3-stroke characters (initial half-form)
with other-than-dot diacritic above the major stroke. It con-
tains 2 characters. Overall accuracy: 100% . . . . . . . . . . 101
Table A.8 Confusion matrix for 4-stroke characters (initial half-form)
with dot diacritic above the major stroke. It contains 3 char-
acters. Overall accuracy: 88.8% . . . . . . . . . . . . . . . . . 101
Table A.9 Confusion matrix for 4-stroke characters (initial half-form)
with dot diacritic below the major stroke. It contains 3 char-
acters. Overall accuracy: 92.7% . . . . . . . . . . . . . . . . . 102
Table A.10 Confusion matrix for single-stroke characters (medial half-
form). It contains 8 characters. Overall accuracy: 89.1% . . . 102
Table A.11 Confusion matrix for 2-stroke characters (medial half-form)
with dot diacritic above the major stroke. It contains 8 char-
acters. Overall accuracy: 93.6% . . . . . . . . . . . . . . . . . 102
Table A.12 Confusion matrix for 2-stroke characters (medial half-form)
with other-than-dot diacritic above the major stroke. It con-
tains 4 characters. Overall accuracy: 93.3% . . . . . . . . . . 103
Table A.13 Confusion matrix for 2-stroke characters (medial half-form)
with dot diacritic below the major stroke. It contains 2 char-
acters. Overall accuracy: 98.3% . . . . . . . . . . . . . . . . . 103
Table A.14 Confusion matrix for 3-stroke characters (medial half-form)
with dot diacritic above the major stroke. It contains 2 char-
acters. Overall accuracy: 95.0% . . . . . . . . . . . . . . . . . 103
Table A.15 Confusion matrix for 3-stroke characters (medial half-form)
with other-than-dot diacritic above the major stroke. It con-
tains 2 characters. Overall accuracy: 95.8% . . . . . . . . . . 103
Table A.16 Confusion matrix for 4-stroke characters (medial half-form)
with dot diacritic above the major stroke. It contains 2 char-
acters. Overall accuracy: 95.8% . . . . . . . . . . . . . . . . . 104
Table A.17 Confusion matrix for 4-stroke characters (medial half-form)
with dot diacritic below the major stroke. It contains 2 char-
acters. Overall accuracy: 100% . . . . . . . . . . . . . . . . . 104
Table A.18 Confusion matrix for single-stroke characters (terminal half-
form). It contains 16 characters. Overall accuracy: 94.7% . . 104
Table A.19 Confusion matrix for 2-stroke characters (terminal half-form)
with dot diacritic above the major stroke. It contains 9 char-
acters. Overall accuracy: 96.7% . . . . . . . . . . . . . . . . . 105
Table A.20 Confusion matrix for 2-stroke characters (terminal half-form)
with other-than-dot diacritic above the major stroke. It con-
tains 7 characters. Overall accuracy: 99.0% . . . . . . . . . . 105
Table A.21 Confusion matrix for 3-stroke characters (terminal half-form)
with dot diacritic above the major stroke. It contains 3 char-
acters. Overall accuracy: 99.4%. . . . . . . . . . . . . . . . . 105
Table A.22 Confusion matrix for 4-stroke characters (terminal half-
forms) with dot diacritic above the major stroke. It contains
4 characters. Overall accuracy: 99.6% . . . . . . . . . . . . . 106
Table A.23 Confusion matrix for single-stroke characters (initial half-
form). It contains 7 characters. Overall accuracy: 95.4% . . . 107
Table A.24 Confusion matrix for 2-stroke characters (initial half-form)
with dot diacritic above the major stroke. It contains 6 char-
acters. Overall accuracy: 99.0% . . . . . . . . . . . . . . . . . 107
Table A.25 Confusion matrix for 2-stroke characters (initial half-form)
with other-than-dot diacritic above the major stroke. It con-
tains 6 characters. Overall accuracy: 89.3% . . . . . . . . . . 108
Table A.26 Confusion matrix for 2-stroke characters (initial half-form) with dot diacritic below the major stroke. It contains 3 characters. Overall accuracy: 98.6%
Table A.27 Confusion matrix for 2-stroke characters (initial half-form) with other-than-dot diacritic below the major stroke. It contains 2 characters. Overall accuracy: 96.0%
Table A.28 Confusion matrix for 3-stroke characters (initial half-form) with dot diacritic above the major stroke. It contains 3 characters. Overall accuracy: 94.6%
Table A.29 Confusion matrix for 3-stroke characters (initial half-form) with other-than-dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 100%
Table A.30 Confusion matrix for 4-stroke characters (initial half-form) with dot diacritic above the major stroke. It contains 3 characters. Overall accuracy: 86.6%
Table A.31 Confusion matrix for 4-stroke characters (initial half-form) with dot diacritic below the major stroke. It contains 3 characters. Overall accuracy: 94.0%
Table A.32 Confusion matrix for single-stroke characters (medial half-form). It contains 8 characters. Overall accuracy: 93.7%
Table A.33 Confusion matrix for 2-stroke characters (medial half-form) with dot diacritic above the major stroke. It contains 8 characters. Overall accuracy: 90.0%
Table A.34 Confusion matrix for 2-stroke characters (medial half-form) with other-than-dot diacritic above the major stroke. It contains 4 characters. Overall accuracy: 93.0%
Table A.35 Confusion matrix for 2-stroke characters (medial half-form) with dot diacritic below the major stroke. It contains 2 characters. Overall accuracy: 100%
Table A.36 Confusion matrix for 3-stroke characters (medial half-form) with dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 97.0%
Table A.37 Confusion matrix for 3-stroke characters (medial half-form) with other-than-dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 98.0%
Table A.38 Confusion matrix for 4-stroke characters (medial half-form) with dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 96.0%
Table A.39 Confusion matrix for 4-stroke characters (medial half-form) with dot diacritic below the major stroke. It contains 2 characters. Overall accuracy: 100%
Table A.40 Confusion matrix for single-stroke characters (terminal half-form). It contains 16 characters. Overall accuracy: 96.3%
Table A.41 Confusion matrix for 2-stroke characters (terminal half-form) with dot diacritic above the major stroke. It contains 9 characters. Overall accuracy: 91.5%
Table A.42 Confusion matrix for 2-stroke characters (terminal half-form) with other-than-dot diacritic above the major stroke. It contains 7 characters. Overall accuracy: 99.4%
Table A.43 Confusion matrix for 3-stroke characters (terminal half-form) with dot diacritic above the major stroke. It contains 3 characters. Overall accuracy: 100%
Table A.44 Confusion matrix for 4-stroke characters (terminal half-forms) with dot diacritic above the major stroke. It contains 4 characters. Overall accuracy: 98.5%
Abstract
This thesis presents an online handwritten character recognition system for Urdu
handwriting. The main target is to recognize handwritten script input on the
touch screen of a mobile device in particular, and on other touch-input devices in
general. Urdu characters are difficult to recognize because of the inherent
complexities of the script. In written text, Urdu characters appear in full form as
well as in half-forms: initial, medial, and terminal. Ligatures are formed by
combining two or more half-form characters. The character-set of half-forms has
108 elements, and the whole set of 108 elements is too difficult to classify
accurately with a single classifier.
In this work, a framework for the development of an online Urdu handwriting
recognition system for smartphones is presented. A pre-classifier is designed to
segregate the large Urdu character-set into 28 smaller subsets, based on
the number of strokes in a character and on the position and shape of the diacritics.
This pre-classification makes robust and accurate recognition feasible on the
relatively slow processors and limited memory of mobile devices, through banks
of computationally less complex classifiers. Based on the decision of the
pre-classifier, the appropriate classifier from the bank is loaded into memory to
perform the recognition task. A comparison of different classifier-feature
combinations is presented in this
study to exhibit the features’ discrimination capability and classifiers’ recognition
ability. The subsets are recognized with different machine learning algorithms
such as artificial neural networks, support vector machines, deep belief networks,
long short-term memory recurrent neural networks, autoencoders-support vector
machines, and autoencoders-deep belief networks. These classifiers are trained
with wavelet transform features, structural features, and with sensory input val-
ues. A maximum overall classification accuracy of 97.2% has been achieved. A large
database of handwritten Urdu characters is developed and employed in this study.
This database contains 10800 samples of the 108 Urdu half-form characters (100
samples of each character) acquired from 100 writers.
List of Publications and Patents
Journal Publication:
• Safdar, Quara tul Ain, Khan, Kamran Ullah, and Peng, Liangrui, “A Novel
Similar Character Discrimination Method for Online Handwritten Urdu
Character Recognition in Half Forms”, Scientia Iranica, vol. , pp. , 2018.
ISSN: 1026-3098, DOI: 10.24200/sci.2018.20826
Conference Publication:
• Q. Safdar and K. U. Khan, “Online Urdu Handwritten Character Recog-
nition: Initial Half Form Single Stroke Characters”, in 12th International
Conference on Frontiers of Information Technology, Dec 2014, pp. 292–297.
List of Abbreviations and
Symbols
AE AutoEncoders
ANN Artificial Neural Network
BPNN Back Propagation Neural Network
BRNN Bidirectional Recurrent Neural Network
DBN Deep Belief Network
GUI Graphical User Interface
IHF Initial Half-Form
LSTM Long Short-Term Memory
MHF Medial Half-Form
MLP Multi-Layer Perceptron
NLPD National Language Promotion Department
OCR Optical Character Recognition
OS Operating System
PDA Personal Digital Assistant
POS Point of Sale
RBF Radial Basis Function
RBM Restricted Boltzmann Machine
RNN Recurrent Neural Network
SC Stroke Count
SVC Support Vector Classifier
SVM Support Vector Machine
THF Terminal Half-Form
UK United Kingdom
USA United States of America
Chapter 1
Introduction
Online handwritten character recognition is a process in which the data stream of
handwritten characters is collected, recognized, and converted to editable text as
the writer writes on a digital surface [1], [2]. The digital surface may be a tablet or
any hand-held device (such as a personal digital assistant or smartphone) that allows
handwriting on its surface either with an electronic pen/stylus or with a fingertip.
1.1 Background
Writing by hand, an illustration of a synchronization of mind and body, is one of
the most mesmerizing and influential inventions of human beings. It is seeded in
artistic depictions engraved on rocks, etched in sand, and marked on walls that at
last morphed into alphabets [3] (see Figure 1.1), ligatures, graphemes, and words.
Each hand-drawn shape, each handwritten word is not merely a scribbled expression
but the most natural way of exchanging information. It reminds us that we
are still conducting the ancient act of using hands to transcribe what rests in our
minds. Reading and writing play a vital role in developing a civilized society. Since
its early days, around 5000 years ago in Mesopotamia and Egypt, different
symbols (alphabets) were coined [4] in order to preserve thoughts and facts. Symbols
were imprinted or scratched in clay, or drawn on parchment, wax tablets, and
papyrus with the help of quill pens and reed pens. They also made use of thin
metal sticks called stylus (pl. styluses or styli) for writing in wax tablets and for
palm-leaf manuscripts. With the passage of time, interaction among individuals
and tribes increased. Kingdoms expanded, and keeping track of historical
and environmental events became a calendrical and political necessity to survive
and rule. The complexity of administrative actions and trade transactions out-
grew human memory. It required the administrators and traders to keep record
of administrative affairs and transactions in some permanent form [5]. The obser-
vance of this substantial requirement helped writing to evolve as a more reliable
method for registering and presenting the matters and events, deals and deeds,
actions and transactions, and many other goings-on. The earliest writing
implements were quills, reeds, and metallic sticks. To speed up the writing
process, these were gradually complemented by letterpress, stamps, chalks,
split-nib pens, dip pens, graphite pencils, etc. [6]. With the development of pen
and paper, handwriting became the prevailing mode of documentation. Afterwards,
handwriting became part of literacy culture, qualified as a rudiment of academics
and considered imperative to professional life. Nowadays, the mode of writing is
going through a dramatic change.
[Figure 1.1 chart: alphabet symbols traced from Sumerian (4000 BCE) and Egyptian (2000 BCE) forms through Early Semitic (1800), Phoenician (1100), Early Hebrew (1000), Greek, and Early Latin to Roman (100 CE).]
Figure 1.1: Ancient symbols for alphabets [7]
With the emergence of smart IT equipment and digital writing devices, writing by
hand plays a smaller and smaller part in our daily lives [8]. These days a keypad
or a touch-sensitive screen suffices to do any number of jobs with a single keypress
or tap: operating machines, withdrawing cash,
filling forms, searching a book or an article in an online repository, posting mes-
sages, uploading images, adding animations and much more. As we type merrily
on keypads or signal to touch-screens, handwriting certainly seems like a dying
form. In this scenario, one question should be addressed: is this withering away of
handwriting really a setback, or is it the inexorable evolution of language forms,
unfolding over the centuries from oral to written to printed, and now to electronic
ink [9]? Yet signing on a credit-card payment screen with a mere finger, or
scribbling a signature with an electronic pen at the grocery store, tells us that
handwriting need not fold up and die. Moreover, the desire to fuse the convenience of
handwriting with the need to use, maintain, and communicate digital information
requires the digital industry to embed handwritten-input into hand-held devices,
smart boards and smartphones, tablet PCs, personal digital assistants (PDAs),
and other ubiquitous computing devices. From mainframes to ubiquitous devices,
the shaping of personal computers and miniature devices has taken an intellectual
leap. Undoubtedly, the invention of transistors and ICs revolutionized the
technological means; however, the mere availability of instruments and devices
cannot ensure breakthroughs. Computing for portable devices and smart environments
enhanced human-machine interaction and became the key turning point in the
modern world. Looking back, we see that at first mainframes were machines
shared by many people. Afterwards, in the personal computer era, people were
put into a computer-generated virtual reality, the user and the machine staring
at each other across the desktop. Today, mobile computing has brought machines
out into the physical world with their users. Mobile devices, which started simply
as portable telephones, evolved into smartphones and smart computing devices.
Undoubtedly, this evolution of portable computing devices reshaped the world of
personal computers. An important change that came with this development is the
change in the mode of input to portable devices. Software keyboard replicas
replaced hard keyboards, which turned attention towards non-keyboard-based
interfaces. An interface which interacts with
the device by taking input either through a pen or a finger-tip(s) is said to be a
non-keyboard-based interface. Input through a pen moving on a tablet or through
a finger-tip tapping on a touch-screen has swayed research communities to design
and develop interfaces that can recognize handwritten input efficiently.
1.2 Place of Handwriting in Digital Age
Handwriting represents a person's identity and forms a unique part of a
civilization. It is less restrictive, and more functional and creative, than
keyboarding, its digital counterpart. The handwriting of an individual
and handwritten scripts of a society show evolution of text not only for a per-
son but more importantly for a civilization. Written languages either made up of
letters like Latin, English, German, Devanagari, Arabic, Urdu etc. or consist of
characters like Mandarin, Japanese etc. are the examples for evolution of text.
Through handwritten documents one can see what went before. Writing by hand
is not only an integral part of our daily life but also a learning tool in any
educational system. It is developed as a functional skill because the majority of
our academic examinations are still handwritten. Good handwriting can even serve
as a benefit in scholastics: usually the students who write legibly gain an
advantage over those who cannot. Although technological means are becoming part
of our classrooms, students' ability to write clearly is still a center of
attention. We all know that writing by hand is less restrictive. It gives the writer a free
hand to write things and thoughts in any style, draw any kind of shapes, connect
different sections together, scribble side notes, encircle important information and
much more, wherever and whenever it makes sense. Besides retaining creative flow,
use of the pen also brings cognitive benefits: writing and rewriting notes and
information by hand makes them more likely to be remembered.
1.3 Word Processing Software
On the other hand, with the development of word processing software, the creation,
updating, and maintenance of documents can be viewed on a different level. A
document is typed up, saved with a single click, and edited as many times
as necessary. Pictures, shapes, and diagrams can be added, although graphics
made in word processing software are often not as sophisticated as those created
with specialized programs. Spelling and grammatical mistakes can be corrected
using in-built spell and grammar checking option. Text formatting, margin ad-
justments, and page layout settings are available to make the document look more
appealing, easy to read, and above all in a standard format. Generating multiple
copies and keeping versions of a document, from the oldest to the newest, becomes
an easy task with the help of word processors. Converting a document from soft
form to hard form, that is to say, taking a printout, is merely a click away
(if a printer is already installed). Moreover, the availability of document files
on various platforms, and their synchronization across multiple devices, have made
document handling a considerably easier job.
However, the other side of the picture is that typing in a language which
uses an alphabet different from English (Latin script) is not a trivial
exercise. In fact, there are a number of languages which do not follow the Latin
script and therefore have different character-sets. For example, Bulgarian, Be-
larusian, Russian, Ukrainian, Macedonian, Serbian, Old Church Slavonic, Church
Slavonic use Cyrillic alphabets. Bengali, Devanagari, Gurmukhi, Gujarati, and
Tibetan belong to the Brahmic family of scripts. Urdu follows an Arabic- and
Persian-like script. Chinese, Japanese, Korean, Hebrew, Greek, Armenian, Georgian,
each has its own set of alphabets/characters not matching with Latin alphabets
normally found on a standard keyboard. Moreover, Latin characters with diacritics
(circumflex or umlaut), which are part of some Latin-script-based languages
(e.g. German, French, Swedish, Finnish, Spanish, Italian), are not easily
accessible on a keyboard. Similarly, taking the example of the Japanese language
as used in daily life, there
are more than 3000 Kanji and Kana (Chinese ideographs (Kanji) and Japanese
syllabaries (Kana) where each of the syllabaries appears for one consonant-vowel
pair) characters and digits. Even when designating a nominal subset of this larger
character-set (of 3000 characters), there would be at least 100 characters in the
subset, which is still too large for an ordinary user to input through a
keyboard [10]. Furthermore, incorporating complex mathematical symbols and
equations into a document is not as straightforward as typing a simple English
sentence. It is far easier to handwrite an equation (on a hard copy of the
document) than to use equation-typing software.
1.4 Integrating Handwriting with Technology
From a technology user's point of view, machines that focus on ease of
human-machine interaction receive a warm welcome. Handwritten input is one
example of the conveniences being built into smart machines. So what if we merge
the convenience of handwriting with the smartness of machines?
Earlier, personal computers and machines were provided with keyboards
and keypads. On a keyboard there are two ways to type, either by using two fingers
(Hunt and Peck Method, also called Eagle Finger Typing) or using both hands
where the fingers are set down on A, S, D, F and J, K, L keys and thumbs are used
to access the space bar (Touch-Typing or Touch-Keyboarding). In touch-typing, a
string of keys is typed pressing the keys one finger down at a time without looking
at the keyboard. A typed sentence is obtained through a series of coordinated and
automatized finger movements. However, in touch-typing, pressing the right
key with the right finger requires some beginner's knowledge. Moreover, typing
rehearsal becomes necessary so that the brain can learn finger coordination, and
the intricate movements of the fingers can be executed easily at first and
eventually speedily.
Touch-keyboarding has already been replaced by touch-sensitive screens, panels
and interfaces in writing pads, smartphones, tablets, phablets and many other
portable and functional common electronics, and even in non-portable machines.
A touch-sensitive screen is a device which acts not only as an input device but also
as an output device. Displayed options on a touchscreen (output) can be chosen by
touching the screen (input) with the finger(s) or a special stylus (though for
most modern touch-screens the stylus has become optional). The use of
touch-screens is established in various fields like heavy industry, medical, commu-
nication etc.; especially in those areas where keyboard and mouse may not permit
a suitably intuitive, instantaneous, or precise and accurate interaction between
the user and displayed content like kiosks, ATMs, point of sale (POS) systems,
electronic voting machines etc. Varying from machine to machine either a menu
driven interface or ‘app-icons’ are provided with touch-screens to access different
options or applications. Certainly, the technology with embedded touch screens
and intuitive user interfaces has brought great convenience to human-machine
interaction. However, effective interfacing is not an easy task. Moreover, there
are scenarios, like note taking, drawing/painting or electronic document annota-
tion, where a significant amount of data is taken as input, for which mere touch
interaction is not enough. To make these tasks easier and more natural there
should be other input methods. Today, for natural writing, note taking, and
drawing, a pen or an active stylus can be viewed as the most promising of all
input devices. Being precise and more intuitive, a stylus/pen can enhance the
user's experience of touch devices. Styluses/pens are portable and offer extended
functions such as pressure-sensitivity measurement and auxiliary customizable
buttons for different tasks. Instead of going through menus via touch or click,
it is easier to write the command with a stylus and have the required activity
performed. However, going 'from handwritten command to task done' requires
logically rich and efficient interfacing.
1.5 Difficulties Involved in Handwriting Recognition
As stated above, developing an interface that can recognize and respond
efficiently to handwritten input(s) is a non-trivial job. The task of efficient
interfacing is difficult mainly for two reasons: first, handwriting itself, and
second, the hardware resources available for processing in portable devices.
Writing by hand, whether with a simple lead pencil on paper or with a pen/stylus
on a smart screen, inherits complications from the versatile nature of
handwriting. It also carries the complexities of the language in which the input
command has been written. Each written language follows a particular script and
each script has its own
alphabets and writing standards. Some scripts allow a cursive style of writing,
generally intended to make handwriting faster, while others are non-cursive,
following a 'print script' in which the letters of a word are not connected to
each other. Certainly, the very nature of a language's script poses difficulties
for interface development.
The nature of the script and the versatility of writing by hand are not the only
challenges that make handwriting-recognizing interfaces a tough job. There
are other factors with which developers have to deal. Speed of
writing is one such factor. Humans write things more quickly than they type up
on a touch keyboard or touch screen. This requires that the technology used for
integrating handwriting have a fast response rate, so that it can reproduce the
drawn shapes at the speed of the writer. The technology also has to respond to
various delicate aspects of handwriting, such as the force with which the writing
instrument is used, the tilt of the nib at varying angles, and the quick, or
perhaps slow, rotation of the pen at various degrees. While writing, humans
habitually rest their palm/wrist on the writing surface, or fingers other than
those holding the pen/stylus may touch the writing surface.
If this habit carries over to a pen-tablet or a stylus touchscreen, then the
display must be smart enough to distinguish between the writing (stylus) function
and the touch function. Another important aspect of handwriting is first-touch
latency, or touch lag. A real pen leaves no time gap between inking and writing.
Touch latency for touch surfaces is how
fast a touch is registered on the surface. In other words, there is a delay between
actual physical input occurrence and that input being processed electronically
and displayed on an output device. For any interaction, according to Robert B.
Miller [11], the minimum just noticeable time difference related to the response
time of the system is 100 milliseconds. Humans are quicker and can respond even
in a few milliseconds. Therefore, to replicate the function of handwriting, there
is a need to devise devices which can keep up with human response.
Digging further into technological means and measures, we see that the hardware
resources available to portable devices are limited in capability. Two main
hardware limitations are inherent to smartphones, tablets, and other portable
devices. On one side, the processors available in smart devices are relatively
slow; on the other side, the random access memory and auxiliary storage space
available for or attached to these devices cannot be extended beyond a certain
limit. This lack of resources does not allow the developer to opt for quick but
resource-consuming techniques, but it may open new logical horizons for the
developer in which these challenges can be coped with efficiently.
1.6 Online and Offline Handwriting Recognition
All of the above discussion concerns online handwriting recognition. The terms
dynamic and real-time handwriting recognition are also used in place of online
handwriting recognition. It is a system in which handwriting is converted to text
as it is registered on a special digitizer, a smartphone, a PDA, or any other
appropriate hand-held device. In simple words, the machine recognizes the writing
while the writing process is in progress [12]. In this type of recognition system,
a transducer (e.g. a PDA or smartphone) records pen-tip movements and
pen-up/pen-down events. The data generated from these movements and events is
known as digital ink: the digital representation of handwriting. The elements of
an online handwriting recognition system
include a stylus/pen, a touch-sensitive surface either embedded in or attached to
some output display, and software that can translate pen movements across the
touch surface into writing strokes and digital text. The input through the pen
is dynamic, expressed as a function of time and of pen-stroke order. The digital
representation of the input (the pen movements, in fact) is time-dependent
sequential data based on the pen trajectory. It gives not only two-dimensional
information about position, velocity, and acceleration but also records the
pressure values, number of strokes, stroke order, and stroke direction.
Offline handwriting recognition or optical character recognition (OCR), in
contrast to online handwriting recognition, is conducted after the writing activity
is completed. In offline handwriting recognition, a raster image of the typed,
printed, or handwritten text is taken from an optical scanner or any other digital
input source (e.g. digital camera). The text might be typed, printed, or written
by hand on a document, on a sign board, on a billboard etc. It might be a
caption superimposed on some picture, photograph, figure etc. It may also be
subtitles embedded into a video or movie. The digital devices like optical scanners
or digital cameras yield the bit pattern of the image of the typed, printed or
handwritten text. After the handwriting is made available in the form of an
image, the recognition task can be performed at some later time, for example,
days, months, or even after years. The image obtained for offline recognition is
converted to a binary or colored image. A binary image is one in which the
pixels are either 0 or 1. To acquire the binary version of an image, a
thresholding technique is used. The technique is applicable to both colored and
gray-scale images.
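The thresholding step described above can be sketched as follows. This is a minimal illustration assuming a gray-scale image stored as nested lists of 0-255 intensities; the function name `binarize` and the fixed threshold of 128 are illustrative assumptions (practical systems often pick the threshold adaptively, e.g. with Otsu's method).

```python
def binarize(gray, threshold=128):
    """Binarize a gray-scale image given as rows of 0-255 intensity values:
    pixels at or above the threshold map to 1 (light background), while
    darker pixels (ink) map to 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]
```

For example, `binarize([[0, 200], [130, 100]])` maps the two dark pixels to 0 and the two light pixels to 1.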
1.6.1 Dynamic Information Acquired through Online Hardware
Today, with technological development, we are able to get real-time information
for a given process. One example of this ascent can be seen in online handwriting
devices. Online handwriting hardware has risen to a maturity level at which
first-hand information can be obtained instantaneously. Moreover, this first-hand
information helps in extracting and computing further information very easily.
The instantaneously acquired information includes:
• Precise loci of the pen as a function of time. This also includes retraces of
the stroke made by the writer
• Pen inclination as a function of time. It reflects the trend of the pen/stylus
movement
• Pen pressure value for each pen locus
• A portrait of the full stroke, comprising the pen-down and pen-up events and
all intermediary points between each pen-down and pen-up event
• Temporal connection between major and minor strokes to form a character
Above information can further be processed to yield the following:
• Velocity and acceleration with which a stroke is penned down
• Direction of the pen stroke
• Number of strokes with which a character is drawn
• Order of the strokes to form a character
• Variations at beginnings and endings of the strokes
• Variations in stroke length and width
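Several of the derived quantities listed above follow from finite differences over the time-stamped samples. The sketch below is illustrative rather than the thesis's implementation; the function name and the `(x, y, t)` tuple format are assumptions.

```python
import math

def dynamic_features(points):
    """Approximate per-segment speed and direction of one pen stroke from
    time-stamped samples (x, y, t), using finite differences between
    consecutive samples."""
    features = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        distance = math.hypot(x1 - x0, y1 - y0)
        speed = distance / dt if dt > 0 else 0.0          # units per second
        direction = math.degrees(math.atan2(y1 - y0, x1 - x0))
        features.append((speed, direction))
    return features
```

Acceleration could be obtained the same way, by differencing the speed values once more.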
All the dynamic information about how a character has been written is lost in the
scanned version of that handwritten character. That is why offline images of
handwritten scripts are referred to as static images. It is hard to acquire
dynamic features from static images. However, with today's available online
hardware, dynamic attributes can be obtained with quite reasonable accuracy.
1.6.2 Advantages of Online Handwriting Recognition over Offline
Handwriting is a natural and appealing style of input. That is why it is more
acceptable than keyboarding. Since the online handwriting signal is immediately
available and processed, the workflow is improved. The main difference between
online and offline recognition is the method of capturing the handwriting data.
In online recognition systems, handwritten data is captured at the instant the
person writes on the writing surface. In offline recognition, data is captured at
some later time, after the writing is created. The advantage of online data
recording
is that online devices also capture temporal information of handwritten stroke
which is not available in offline images. Temporal information helps to keep track
of stroke order and direction. Such information may not seem very beneficial
for languages, like English, where stroke order does not matter. However, for
languages like Chinese, Arabic, and Urdu, in which writing a character is
stroke-order dependent, temporal information becomes more favorable to the
recognition process. Temporal differences in writing can be used to identify and
discard overlapping strokes in a written character.
Another advantage of online systems over offline ones is that an online system
provides interactivity between the writer and the device at the time of writing.
This interaction allows the user to edit and/or rectify mistakes immediately,
which lets recognition errors be corrected on the spot. On the other hand, in
offline systems, writer-machine interaction does not occur while the writing
activity is in process. In fact, as stated above, in an offline recognition
system the interaction with the machine/device happens only after the writing is
materialized. The result of this script-machine interaction is a scanned or
digital image which is then passed to some recognition process.
Adaptation is another advantage of online recognition systems. Two possibilities
of adaptation can be observed for online recognition systems: one is
writer-to-machine adaptation and the other is machine-to-writer adaptation.
Writer-to-machine adaptation brings an advantage when the writer, seeing that
some of the written characters are not correctly recognized, modifies his
drawings to improve the recognition. Machine-to-writer adaptation is beneficial
when the recognizer has the ability to adapt to the writer. Such recognizers can
store the writer's samples of handwritten strokes for subsequent recognition.
On the other hand, to produce more respectable results, offline character
recognition systems are subject to some constraints. For example, the scanned or
digital image fed to the recognition engine should have clear contrast between
image colors and even lighting exposure. For a good chance of recognition, a
specific
image resolution is another important factor. For example, for text recognition
in a document with Google's OCR software built into Google Drive, the text height
should be at least 10 pixels to increase the recognition probability. Good
recognition results might also be font dependent: with Google's OCR, for best
results the document should be prepared in Arial or Times New Roman font
(English script). Offline recognition engines also put constraints on
file format (jpg, tiff, png, pdf etc.), text layout (single/multicolumn), skewness,
brightness and other layouts of the scanned image/document. For example, earlier
versions of Tesseract engine, originally developed by Hewlett Packard Labs, were
not able to process images for text in two columns and other than TIFF format.
1.6.3 Available Handwriting Recognition Software
There are many software applications available for offline character recognition.
The following are a few worth mentioning:
• Tesseract OCR, a free character recognition software supported by Google
since 2006 [13]. Initially developed for English text, it can now recognize
printed text in more than 100 languages [14]. In terms of character recognition
accuracy, Tesseract OCR has been considered one of the most accurate OCR
engines [15], [16], [17]. The output formats are text, hOCR, pdf, and others,
with different APIs.
• Google's OCR is provided with Google Drive and can recognize 100+ languages
with over 90% accuracy. It can take images (jpg, png) as well as multipage pdf
documents as input for recognition. For Urdu handwritten text, this OCR has not
been found very accurate.
• IRIS Readiris, for Mac and Windows operating systems (OS), implements optical
recognition technology and converts images and pdf files into editable files.
The converted file format may be Word, Excel, PDF, HTML, etc., and can be chosen
by the user. The software keeps the original layout intact. IRISDocument Server
is a server-based OCR solution which automatically converts unlimited volumes of
images into fully editable, structured formats. It also offers hyper-compression
of the converted documents for long- or short-term archiving. It can handle more
than 100 languages for text recognition [18].
• ABBYY FineReader OCR software converts digital photographs and scanned
documents, in image or pdf format, into editable formats. The output document
format may be any of RTF, TXT, DOC, DOCX, PDF, XLS, XLSX, HTML, PPTX, CSV, EPUB,
DjVu, ODT, or FB2, as per the user's choice. The OCR can recognize 192
languages [19].
• CuneiForm, developed by Cognitive Technologies (a Russian software company),
converts electronic copies of images and documents into editable formats. The
conversion is accomplished without changing the fonts and structure of the
document/image. It can recognize 28 languages in any printable font (including
Russian-English bilingual, French, German, and Turkish), saving the output in
hOCR, HTML, TeX, RTF, or TXT formats.
• OmniPage is another OCR software. With an auto language detection feature, it
can recognize 120 different languages, converting scanned documents into
searchable and editable electronic versions. The output file matches the
original input document exactly in color, font, and layout. It is sold by
Nuance Communications [20].
The above are just a few OCR applications. In fact, there is a long list of
offline character recognition software, including OCR using Microsoft OneNote
2007, Office Lens (an OCR application by Microsoft for mobile phones), OCR using
Microsoft Office Document Imaging, PDF Scanner (a document scanner with OCR
technology, also available for Android users), ONLINE OCR [21] (a free online
OCR facility for personal computers; Arabic/Persian/Urdu are not supported),
SimpleOCR (freeware), and the SimpleOCR SDK for developers (royalty-free). This
list shows that a lot of work has been done on the offline version of character
recognition. However, for online character recognition there is a lack of
available software, either commercial or non-commercial. MyScript Nebo is an
application for online handwriting recognition provided by Vision Objects. It
turns natural handwriting, recorded through a stylus, a digital pen, or a
finger, into computer-readable information. The software is available for Linux,
Microsoft Windows, Apple macOS, iOS, and Android. Many hand-held and smart
devices are supported by MyScript Nebo, such as the Samsung Galaxy Tab S3 with
S-Pen, Samsung Galaxy Note 10.1 (2014 edition) with S-Pen, Samsung Galaxy Note
Pro 12.2 with S-Pen, iPad Pro and iPad 2018 with Apple Pencil, Microsoft Surface
Pro 3 (Intel Core i3, i5, i7), Microsoft Surface Pro 4 (Intel Core m3, i5, i7),
Sony Vaio 13 (Core i5), Huawei MediaPad 2 10.1 with active pen, and many
more [22]. Recognizing 59 languages [23], the software can convert handwritten
notes, mathematical equations, and geometrical shapes into editable and
searchable digital text/ink. However, it does not support the most widespread
right-to-left languages, namely Hebrew, Arabic, Persian, and Urdu [24].
1.7 Problem Statement: Online Handwritten Urdu Char-
acter Recognition
Character recognition has enjoyed a lot of research attention in the recent
past. Good recognition systems are available commercially for alphabetic
languages based on Roman characters and for symbolic languages like Chinese.
But languages based on the Arabic alphabet, like Urdu, Pashto, and Sindhi, do
not have such recognition systems. Recognition systems generally have a scanner
or a camera as the input device for offline recognition, or a stylus/tablet as
the input device for online recognition. These systems are used in conjunction
with input peripherals like keyboards and mice. With recent developments in
electronic tablets, pen movements and pressure can be captured more accurately.
In spite of these technological developments, however, there is no application
software which can recognize Urdu characters written by hand using a pen-tablet
or on a smartphone with a stylus. Urdu is based on the Arabic alphabet but has a
larger character-set (38 characters) than Arabic. Due to its large character-set
and the limited number of distinct strokes, Urdu is difficult to recognize. In
this work, we have focused on the recognition of Urdu characters in half-forms.
Half-form characters appear at the start, middle, or end of a word when writing
cursively. The emphasis of this thesis is to propose a technique to recognize
handwritten Urdu characters using a pen-tablet.
1.8 Motivation
The absence of a handwritten character recognition application for the Urdu
language is the motivation behind this work. Such an application, in this
digital age, is of national interest. It can be used in mobile phones with
styluses, in personal digital assistants, and in any portable device with pen
input. Pakistan has about 200 million inhabitants, and Urdu is a primary
language of communication. According to the Pakistan Telecommunication
Authority, there are about 130 million mobile phone users [25]. According to
market estimates based on current trends in the e-commerce sector, there could
be 40 million smartphones in Pakistan in the coming year [26]. In that scenario,
there is a need to carry out research in the design and development of online
Urdu handwriting recognition systems for computing devices (like smartphones)
to benefit the large Urdu-speaking population of the world. It will also be
helpful for Urdu data entry by people not experienced with the Urdu keyboard.
Moreover, it can serve as Urdu handwriting tutor software for children and new
learners. The application can further be extended to touch systems as well. An
online Urdu handwriting recognition system can also extend its benefits to the
users of other Arabic-script-based languages like Persian, Uyghur, Sindhi,
Punjabi, and Pashto with minor modifications.
1.9 Literature Review
The Urdu script comprises a large character-set with cursively written and
contextually dependent alphabets. Being context dependent, Urdu alphabets
adjust their shapes according to the preceding and succeeding characters. In
this way, an Urdu alphabet has one full-form and, with a few exceptions, at
least three different half-forms. Moreover, complexities for Urdu handwriting
recognition arise not only from cursiveness and context dependency but also
from the very nature of the alphabet structure, word formation in a particular
font style, and the diacritics involved in alphabets. Overlapping ligatures,
delicate joints of characters in a word, slanted traces, neither a fixed
baseline nor a standard slope (in the Nastalique font style), associated dots
and other diacritic symbols which may be above, below, or within the character,
and the displacement of dots with the base stroke's slope and context [27–29]
are a few examples of the complexities of the Urdu script.
On the basis of the recognition target-set, Urdu handwriting recognition (both
offline and online) can be placed into three categories: isolated- or full-form
character recognition [30–33], ligature-based recognition or the holistic
approach (also known as the segmentation-free approach) [27, 34–39], and the
segmentation-based or analytical approach [40–50]. Moreover, different
researchers have tried to address the recognition problem by focusing on
different aspects. For example, the authors in [51] worked out the baseline (an
imaginary line on which characters are combined to form the ligatures) of the
character stroke, the work in [52] discussed the diacritical marks associated
with characters and ligatures, and the approach in [53] emphasized the
preprocessing operations.
Following the analytical approach along with a dictionary-based search to obtain
valid characters and words, Malik et al. [30] recognized 39 isolated characters
with an overall accuracy of 93% and 200 two-unattached-character ligatures with
an accuracy of 78%. Hussain et al. [36] preferred the holistic approach,
proposed a spatial-temporal artificial neuron for the recognition, and reported
an accuracy of 85% for 15 selected ligatures only. However, their data-set lacks
generality, as it was acquired from only two different writers. Husain et
al. [37] investigated the recognition of one-, two-, and three-character
ligatures and obtained separate results of 93% and 98% for base and secondary
strokes, respectively. Shahzad et al. [31] studied the recognition of 38
isolated Urdu characters using 9 geometric features for the primary stroke and 4
for the secondary stroke to achieve an accuracy of 92.8% for data obtained from
only two native writers; however, the recognition rate diminished to 31% when
the characters were scribbled by an untrained non-native writer. With data
scribbled by trained non-native writers, the recognition rate increased only to
73%. Razzak et al. [38, 39] investigated the recognition of 1800 ligatures. By
utilizing features based on fuzzy rules and a hidden Markov model, they secured
an 87.6% recognition rate for the Urdu Nastalique font and 74.1% for the Naskh
font. Most of the work available in the online domain of Urdu character
recognition deals with ligature and full-form recognition. Segmentation-based
approaches have been applied either to segment the ligatures of a word from
each other or to dissociate the diacritics from the base character [29]. It
should be noted that, to the best of the authors' knowledge, no prior work uses
wavelet analysis for the recognition of Urdu characters. However, studies have
been reported for Arabic and Persian character recognition using wavelets.
Therefore, on the basis of the similar script and the use of wavelet analysis,
the work presented in this thesis is compared with Arabic and Persian work as
well. Table 1.1 and Table 1.2 present a comparison of the proposed work with
Arabic and Persian recognition systems using wavelet analysis.
Inspired by [32, 33, 64], this work addresses the online Urdu character
recognition problem for context-dependent shapes of Urdu characters, that is,
for half-forms. For the development of an online cursive Urdu handwriting
recognition system, recognition of half-form Urdu characters is a primary step
for the following four reasons. First, Urdu characters appear in half-forms
within a word; although full-form letters are also used within a word, the role
of half-forms is much greater than that of full-forms. Second, half-form
characters are the building blocks of ligatures, and segmentation-based systems
therefore eventually attempt to recognize the constituent half-forms
[40, 45, 49, 50]. Third, there are far more ligatures in Urdu than can be
entirely enclosed within the scope
Table 1.1: Comparison of the proposed online Urdu handwritten character
recognition method with Arabic work

| Authors | Type | Character-Set × Samples | Classification | Participants | Accuracy (%) |
|---|---|---|---|---|---|
| Esam et al.* [54] | half-forms | 6033 | RNN | IfN/ENIT Database | 73%-80% |
| Jannoud [55] | isolated, half-forms | Not reported | MLE | Not reported | isolated 99%, half-forms 90%-91% |
| Asiri and Khorsheed [56] | isolated, half-forms | 30×500 | ANN | Not reported | 74%, 82%, 88% for 3 different sets of wavelet coefficients |
| Aburas and Rehiel [57] | isolated | 28×48 | Codebook Search, EDM | 48 | 97.9% |
| Kour and Saabne** [58] | isolated, half-forms | 3145 | BPNN, Correlation Classifier, PNN | ADAB Database | 87%, 89%, 92%, 95% |
| Proposed work*** | half-forms | 108×100 | BPNN, SVM, RNN, DBN | 100 (self-accumulated) | 87.5%-100% |

*with statistical, geometrical, and Fourier descriptor features
**with structural features
***for Urdu characters with sensory input, structural, and wavelet features
Table 1.2: Comparison of the proposed online Urdu handwritten character
recognition method with Persian work

| Authors | Type | Character-Set × Samples | Classification | Participants | Accuracy (%) |
|---|---|---|---|---|---|
| Mowlaei et al. [59] | isolated | 32×190 | MLP | 200 | 92.3% |
| Broumandnia et al. [60] | words | 100×8 rotations of each word | Mahalanobis Classifier | 12 | 65% to 96% |
| Jenabzade et al. [61] | isolated | 33×200 | MLP | Not reported | 86.3% |
| Nasrollahi and Ebrahimi [62] | sub-words of 4 fonts and 3 sizes | 87804 | Pictorial Dictionary, EDM | laser printer | 97.9% |
| Vahid and Sohrabi* [63] | isolated | 4000 | HMM | 120 (TMU Data-Set) | 94.2% |
| Proposed work** | half-forms | 108×100 | BPNN, SVM, RNN, DBN | 100 (self-accumulated) | 87.5%-100% |

*geometrical features
**for Urdu characters with sensory input, structural, and wavelet features
Table 1.3: Comparison of online Urdu handwritten character recognition

| Authors | Type | Data-Set | Approach | Features | Classification | Participants | Accuracy (%) |
|---|---|---|---|---|---|---|---|
| Malik et al. [30] | isolated | 39 | Analytical | Structural | Tree-based dictionary search | Not reported | 93% |
| Malik et al. [30] | 2-character ligatures | 200 | Analytical | Structural | Tree-based dictionary search | Not reported | 78% |
| Hussain et al. [36] | ligatures | 300 | Holistic | Primitives | Spatio-temporal artificial neuron search | 2 | 85% |
| S. A. Husain et al. [37] | ligatures | 250 base ligatures with 6 secondary strokes | Holistic | Syntactical | BPNN | 2 | 93% for base stroke, 98% for secondary stroke |
| Shahzad et al. [31] | isolated | 152 | Analytical | Structural | Linear classifier | 2 | 92.8% |
| Shahzad et al. [31] | isolated | 76 | Analytical | Structural | Linear classifier | 1 untrained non-native | 31% |
| Shahzad et al. [31] | isolated | 76 | Analytical | Structural | Linear classifier | 1 trained non-native | 92.8% |
| Razzak et al. [38] | ligatures | 1800 | Holistic | Statistical and Structural | Hidden Markov Model and Fuzzy Logic | Not reported | 87.6% for Nastalique, 74.1% for Naskh |
| K. U. Khan et al. [32, 33] | isolated | 3145 | Holistic | Structural | BPNN, Correlation Classifier, PNN | 85 | 87%, 89%, 92%, 95% |
| Proposed work | half-forms | 108×100 | Analytical | Sensory input, Structural, and Wavelets | BPNN, SVM, RNN, DBN | 100 | 87.5%-100% |
of a single study. That is why researchers have tried to recognize a selective
number of ligatures through which many words, but not all, can be composed.
Consequently, such systems have a limited vocabulary available for
processing [29, 38, 53]. Furthermore, for acquiring a valid ligature or finding
an optimum word, dictionary-based search becomes a necessary part of the
work [37]; this, however, is not the case with half-forms. Last, targeting
half-forms means independence from a dictionary: even new words not present in
a dictionary can be recognized.
Online recognition is preferred over offline systems because there have been
fewer efforts towards the development of online Urdu handwritten character
recognition; much remains to be explored in this broad field. Moreover, unlike
the static images used in offline recognition, the dynamic information of the
pen movement recorded for online recognition helps to develop better, easier,
and faster recognition algorithms. The advantage of a digital pen is that it
immediately transforms handwriting into a digital representation that can be
reused later without any risk of degradation. Furthermore, storage space
complexity can be reduced, with a significant reduction in the memory required
by online data [65] as compared to scanned images. In fact, because of these
characteristics of online data, some researchers have tried to superimpose
pseudo-temporal information and retrieve writing-order information for offline
static handwriting images [66–69]. Table 1.3 shows a comparison between the
available research work and this proposed work.
1.10 Contributions
The main contributions of this work are as follows:
1. A framework for the development of online Urdu handwriting recognition for
smartphones has been presented.
2. Based on the number of strokes in a character and the position and shape of
its diacritics, segregation of the large character-set into smaller subsets is
obtained through the proposed pre-classification, in contrast to previous
online Urdu character recognition approaches such as [30, 31, 36–39, 53].
3. To cope with the demand for robust and accurate recognition under the
relatively low computational power and limited memory available to mobile
devices, banks of computationally less complex classifiers are developed, from
which the appropriate classifier is loaded into memory to perform the
recognition task.
4. A comparison of different classifier-feature combinations is presented in this
study to exhibit the features’ discrimination capability and classifiers’ recog-
nition ability.
5. A comparison of feature-based classifiers (Artificial Neural Networks (ANN),
Support Vector Machines (SVM)) and end-to-end classifiers (Recurrent Neu-
ral Networks (RNN), Deep Belief Networks (DBN)) is presented.
6. Noting the small databases of existing Urdu character recognition
works [31, 36, 38, 39], a large database of handwritten Urdu characters has
been developed and employed in this study. It contains 10800 samples of all
Urdu half-form characters (100 samples of each character) acquired from 100
writers. The database can be obtained from the authors for research purposes.
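The pre-classification and classifier-bank scheme described in the contributions above can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the subset key here is simply the stroke count, and the per-subset classifiers are hypothetical stubs standing in for the trained BPNN/SVM/RNN/DBN models.

```python
def preclassify(strokes):
    """Pre-classification sketch: map a drawn character to a subset key
    by its stroke count (Urdu characters have 1 to 4 strokes)."""
    return min(len(strokes), 4)

class ClassifierBank:
    """Bank of per-subset classifiers. Only the classifier needed for the
    current input is built/loaded, keeping the memory footprint small."""

    def __init__(self, loaders):
        self._loaders = loaders  # subset key -> factory returning a classifier
        self._cache = {}

    def classify(self, strokes):
        key = preclassify(strokes)
        if key not in self._cache:  # lazy-load on first use
            self._cache[key] = self._loaders[key]()
        return self._cache[key](strokes)

# Hypothetical usage with stub classifiers that just report their subset.
bank = ClassifierBank(
    {k: (lambda k=k: (lambda s: f"subset-{k}")) for k in (1, 2, 3, 4)})
```

The lazy-loading cache mirrors the stated design goal: on a memory-constrained device, only the classifier for the pre-classified subset is resident at any time.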
1.11 Thesis Organization
The thesis is organized as follows:
Chapter 2 discusses the Urdu character-set, the rules to be followed for Urdu
handwriting, the complexities attached to the shapes/drawings of Urdu
characters, and more.
The data acquisition phase is narrated in Chapter 3. This chapter also describes
the hardware used for Urdu handwriting, the graphical user interface (GUI)
implemented for collecting Urdu handwriting samples, and the development of the
Urdu digital-ink database. Data preprocessing is also covered, including
down-sampling, an algorithm for the removal of repeated data points, and
smoothing of the signals.
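To make these preprocessing steps concrete, here is a minimal sketch of the three operations, assuming a stroke is a list of (x, y) tuples; the step and window sizes are illustrative placeholders, not the thesis's actual parameters.

```python
def remove_repeated_points(points):
    """Drop consecutive duplicate samples produced when the pen rests."""
    cleaned = [points[0]]
    for p in points[1:]:
        if p != cleaned[-1]:
            cleaned.append(p)
    return cleaned

def downsample(points, step=2):
    """Keep every `step`-th sample, always retaining the final point."""
    out = points[::step]
    if out[-1] != points[-1]:
        out.append(points[-1])
    return out

def smooth(points, window=3):
    """Moving-average smoothing of the x and y signals of a stroke."""
    half = window // 2
    result = []
    for i in range(len(points)):
        seg = points[max(0, i - half):i + half + 1]
        result.append((sum(p[0] for p in seg) / len(seg),
                       sum(p[1] for p in seg) / len(seg)))
    return result
```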
In Chapter 4, pre-classification is explored and the results of
pre-classification, in the form of Urdu character-subsets, are reported.
Feature extraction (structural features, sensory input values, and the wavelet
transform) is described in Chapter 5. In Chapter 6, the final classification of
the Urdu character-set using different state-of-the-art classification
techniques is presented and the results are discussed.
The summary and conclusions of the research work are furnished in Chapter 7,
along with future recommendations.
Chapter 2
Urdu
Urdu is classified as an Indo-European, Indo-Iranian, Indo-Aryan, Hindustani,
Western Hindi language [70]. It is mutually intelligible with Hindi; however,
it borrows its formal vocabulary from the Arabic and Persian languages [70]. It
follows a right-to-left writing system based on Perso-Arabic chirography. It is
the statutory national language of Pakistan (1973 Constitution,
Article 251(1)) [71]. It is also constitutionally recognized in India [72] and
has official status not only in the national capital territory of Delhi but
also in six Indian states (Bihar, Uttar Pradesh, Jammu and Kashmir, West
Bengal, Telangana, Jharkhand). Urdu is also a registered language of
Nepal [73]. It is a primary communication language of Pakistan, populated with
about 200 million people (Pakistan's 2017 national census, July 2017 estimate).
There are about 70 million native Urdu speakers in India [74]. The language is
also spoken and used in Bangladesh, Fiji, the Middle East, the USA, and many
other countries around the globe, including the UK (with about 400,000 native
Urdu speakers). In North America and Canada, Urdu is the first language of 30
percent of the immigrants [75].
2.1 Urdu Character-Set
Urdu has a larger alphabet-set (character-set) than Arabic (28 alphabets) and
Persian (32 alphabets). How many letters are there in the Urdu alphabet-set?
The answer is controversial. Urdu is an Indo-Aryan (Indic) language [76].
Although it follows the Perso-Arabic script, the Urdu alphabet-set has been
suitably modified to accommodate phonetic peculiarities, especially aspiration,
retroflexion, and nasalization. The 37 fundamental alphabets are shown in
Figure 2.1. The shapes of the fundamental alphabets are said to be
isolated-forms or full-forms (half-forms of Urdu alphabets are discussed below
in Section 2.1.4). The alphabets added to the fundamental set to meet the
phonetic peculiarities are given in Figure 2.2.

[Figure 2.1: Urdu alphabets (fundamental)]

[Figure 2.2: Alphabets added to the fundamental Urdu alphabets to cope with
phonetic peculiarities]

The National Language Promotion Department
(NLPD) [77], the official authority responsible for taking measures to
implement Urdu as an official language in Pakistan, has declared that Urdu has
exactly 58 letters, including all letters denoting aspirated sounds. However,
many may take exception to this declaration. Nevertheless, there is a kind of
tacit consensus on the total number of letters in the Urdu alphabet, and it is
generally believed that Urdu has 36, 37, 38, or at most 39 letters [29, 33].
The difference is due to the addition of some alphabets to the basic Urdu
alphabet-set, as narrated above.
2.1.1 Urdu Diacritics
In linguistics, a mark which is added to a letter to indicate a special
pronunciation is called a diacritical mark, or simply a diacritic. A diacritic
is also known as a minor stroke of a letter. In Urdu, there are five different
types of diacritics or minor strokes. These are:

• nuqta or dot
• towey
• inverted hay
• hamza
• kash

An Urdu alphabet is shaped by drawing a major stroke with zero, one, two, or
three minor strokes. A few examples of Urdu alphabets with diacritics (minor
strokes) are shown in Figure 2.3.
2.1.2 Single and Multi-Stroke Characters in Urdu
Depending on the alphabet, a major stroke may have diacritic(s) above, below,
or even inside the stroke. The number of diacritics accompanying a given major
stroke defines an alphabet as a single- or a multi-stroke character. A
multi-stroke character consists of two, three, or four strokes in total (see
Figure 2.3).
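Since an online trace arrives as pen-down/pen-up delimited strokes, the single- vs multi-stroke distinction can be read directly off the captured data. The following is a minimal sketch under two stated assumptions: each stroke is a list of (x, y) points, and the longest trace is taken to be the major stroke (a heuristic for illustration, not the thesis's actual rule).

```python
def stroke_category(strokes):
    """Label a drawn character as single- or multi-stroke (2-4 strokes)."""
    return "single-stroke" if len(strokes) == 1 else f"{len(strokes)}-strokes"

def split_major_minor(strokes):
    """Heuristic: take the longest trace as the major stroke and the
    remaining shorter traces as minor strokes (diacritics)."""
    major = max(strokes, key=len)
    minors = [s for s in strokes if s is not major]
    return major, minors
```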
2.1.3 Word-Breakdown Structure in Urdu
The word-breakdown structure of Urdu shows that Urdu words are formed in one of
the following ways:

• Two or more full-forms placed together, that is, no half-form is used at all
(Figure 2.4a).
[Figure 2.3: Examples of Urdu (fundamental) alphabets with major and (zero,
one, two, or three) minor strokes]
• Two or more half-forms joined together to form ligatures, with these
ligatures then placed together to construct an Urdu word (Figure 2.4b). No
full-form is involved in such compositions. A word may have a single ligature
or may be composed of multiple ligatures.

• Full-forms placed with ligatures to form Urdu words (Figure 2.4c). Full-forms
appearing in such words remain detached from the ligatures (Figure 2.4d).
2.1.4 Half-Forms of Urdu Alphabets
[Figure 2.4: Constructing the Urdu-words: (a) words formed using full-forms;
(b) words formed using ligatures; (c) words formed using isolated-forms and
ligatures; (d) joined and detached letters in Urdu-words]

Figure 2.5 shows another set of Urdu characters. These are the half-forms of
the fundamental Urdu alphabets. In other words, the alphabets shown in
Figure 2.1 are the full-forms of the alphabets given in Figure 2.5. The
half-forms are 108 in number.

[Figure 2.5: All Urdu characters in all half-forms]

There are (mainly) three different types of half-forms, as explained below:

• Initial Half-form: When a character occurs at the beginning of a
word/ligature, it adopts its initial half-form. Not every character has an
initial half-form, and there are characters which have more than one initial
half-form. Initial half-forms are 36 in number (see Figure 2.6a).
• Medial Half-form: When a character falls between two characters in a
word/ligature, it takes its medial half-form. Medial half-forms are 30 in
number (see Figure 2.6b). There are characters which do not have a medial
half-form, and there are also characters that have more than one medial
half-form.

• Terminal Half-form: When a character appears at the end of a word/ligature,
it appears in its terminal half-form. Terminal half-forms are very similar in
shape to the respective full-forms. There exist 42 terminal half-forms of
different Urdu characters. Round-Hay is a character which has two different
terminal half-forms. Figure 2.6c shows the terminal half-forms of Urdu
characters.
The use of half-forms is shown in Figure 2.7. Figure 2.8 shows handwritten
half-forms combined to form different words.
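The positional rule described above (initial, medial, terminal) can be expressed as a simple lookup. The table in the sketch below is a hypothetical two-entry excerpt with made-up glyph names; the real mapping covers all 108 half-forms (36 initial, 30 medial, 42 terminal) and, as noted above, can also depend on the neighbouring characters.

```python
# Hypothetical excerpt of a position -> half-form table for two letters.
HALF_FORMS = {
    "beh": {"initial": "beh-initial",
            "medial": "beh-medial",
            "terminal": "beh-terminal"},
    "alif": {"terminal": "alif-terminal"},  # alif has no initial/medial form
}

def half_form(letter, position):
    """Return the half-form name for a letter at a given position, or fall
    back to the full form for letters that do not join in that position."""
    return HALF_FORMS.get(letter, {}).get(position, f"{letter}-full")
```

The fall-back branch models the non-joining behaviour discussed later for characters such as Alif, which keep their full form in the initial context.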
2.2 Urdu Fonts: Where do these Half-Forms come from?
Urdu fonts, like many other fonts, are particular styles of typography with
particular sizes and weights. In this era of digital typography, the word font
is similar to what the typeface was in metal typesetting. Urdu follows the
Arabic and Persian scripts (writing-styles). There have existed, and still
exist, different writing-styles for Arabic and Persian, such as Kufic,
Andalusi-Maghribi, Muhaqqaq, Rayhani, Towqi, Riqa', Thuluth, Naskh, Ta'liq,
Nastalique, Shikaste, Divani, etc. Each script has its own rules for writing
and induces distinct visual characteristics. These writing-styles were
developed and used for different motives and needs, and were also adopted for
Urdu writing.
Urdu writing is an essential part of learning skills at the primary level among
the native population. The art of writing in different styles has been highly
nourished by calligraphers. Figure 2.9 shows Urdu typographic transcription in
five different fonts: Nastalique, Naskh, Kasheeda, Thuluth, and
Andalusi-Maghribi.
[Figure 2.6: Single- and multi-stroke half-forms of Urdu characters:
(a) initial half-forms; (b) medial half-forms; (c) terminal half-forms]
2.2.1 The Nastalique Font
The Nastalique writing-style is the customary one among native Urdu users. It
was originally devised for the Persian script, which is why it is also known as
the Farsi font in the Arab world.

[Figure 2.7: Examples of words composed of half-form characters: (a) use of
initial half-forms; (b) use of medial half-forms; (c) use of terminal
half-forms]

[Figure 2.8: Examples of words composed from (segmented) handwritten half-form
characters]

Besides Persian, Nastalique is also used for writing Sindhi and Punjabi,
two regional languages of Pakistan and India. It is a hybrid of the Naskh and
Ta'liq fonts. The Naskh font is the most common font in printed Arabic because
of its high legibility. It is also used to write the Pashto script (Pashto is a
language of about 40-60 million people around the world [78]). The Ta'liq font
gives a visual effect of letters hanging together or suspended from a line,
while the descending strokes are shaped as loops. Combining the legibility and
grace of both Naskh and Ta'liq, the Nastalique font is visually beautiful,
eminently precise, and inherently intricate. It is mainly characterized by the
presence of diacritics and superposed marks. Due to the beauty and gracefulness
of its trace, it has been used for writing royal messages, letters, albums of
calligraphic illustrations, poetry, and so on.

[Figure 2.9: Different Urdu fonts]

Since its birth and introduction, it has been widely used for literary works in
Iran and the sub-continent. The font has stood the test of time, acknowledged
the necessities of the age, and responded well to the widespread needs of
people. Although difficult to execute, the font was adopted for routine Urdu
writing by the public at large.
The Nastalique font is used to teach writing skills at the early stages of
education. Now, it would be no exaggeration to say that the Nastalique font is
not only the preferred choice for writing but has actually become ingrained in
native Urdu users. For printed Urdu, a computerized version of the font was
created by Mirza Ahmed Jameel in 1980, named Noori Nastalique [79]. It is a
less elaborate adaptation of the Nastalique writing-style; however, it
completely follows the basic characteristics of the style [28]. For Urdu
writing, the difference between Naskh and Nastalique is significant, as shown
in Figure 2.9. In Nastalique, ligatures are oblique, whereas in Naskh ligatures
are placed linearly. Analyzing the shapes of letters in Nastalique, we observe
variations in the shapes of Urdu characters according to their position in a
ligature or a word. Although variations in the shapes of Urdu letters can also
be seen in other writing-styles (such as Naskh), they are more pronounced in
Nastalique. The moulding of a letter/character into a shape different from its
fundamental shape (or full-form) is called the context dependency of Urdu
characters.
Half-forms are these moulded shapes, born of the writing-styles and further
elaborated by Nastalique.
2.2.1.1 Characteristics of Nastalique Font

In this section, some characteristics of the Nastalique writing-style are narrated.
Context Dependency: In the Nastalique font, each character modifies its fundamental
shape according to the context in which it is written. For a given character
there are three possible contexts: the context of beginning, the context of
middle, and the context of final. When a character is to be written at the
beginning of a ligature or word, it generally takes its initial half-form. When
a character is to be placed between two letters, that is, in the context of
middle, it is penned in its medial half-form. When a character appears at the
end of a ligature or word, that is, in the context of final, it is shaped in its
terminal half-form. Context dependency is shown pictorially in Figures 2.10a
and 2.10b for two Urdu characters. As stated in subsection 2.1.4, a character
may have more than one half-form; which half-form a given character takes
depends upon the preceding and subsequent characters to which it has to be
connected [80] (for an example see Figure 2.10c). However, for some characters
there are exceptions to this general context-dependent behavior. They are
described in the following Assertion:
Assertion-1: Alif, Dal, Ddal, Zal, Ray, Aday, Zay, Zhay, and Wao are characters
in the Urdu character-set which, when written in the context of beginning,
keep their isolated (fundamental, or full) form intact and do not attach
themselves to the subsequent character.

Implication-1: Assertion-1 implies that whenever one of these characters occurs
within a word, it terminates its ligature; the subsequent character will either
be in its full-form, if it is the last character of the word, or in its initial
half-form, if it is not. See Figure 2.10d.
Baseline: In every script there is a horizontal line which serves as a baseline for
drawing characters: a line that all the characters in a word, and all the words
in a sentence, touch at some point. The Naskh font gives a very good example
of this concept; see Figure 2.11a. However, this is not the case with the
Nastalique font: there is no horizontal or vertical line which cuts all the
characters in a word at some point [29], [37]. The reason for the absence of
such a line is the tilt induced by the Nastalique writing-style in Urdu
characters.
Atilt Half-forms and Ligatures: In the Nastalique font, half-forms and ligatures
usually commence from a top-right location and finally rest on a bottom-left
point, with few exceptions. This manner of writing produces a tilt in
ligatures: characters place themselves along a diagonal, and therefore a
baseline that touches all the characters in a word at some point cannot be
drawn for this font. The diagonal arrangement also brings forth varying heights
and widths of ligatures. Moreover, the allowed tilting places no limit on the
number of characters stacked in a ligature; character-by-character stacking
simply increases the slope of a ligature compared with ligatures that have
fewer adjoined characters [81]. See Figure 2.11b.
Cursiveness: Urdu writing is cursive in its very nature, and this cursiveness
flourished further with the evolution of the Nastalique font. Cursiveness is
the property of rapid writing in which successive letters within a ligature or
word are connected without lifting the pen from the writing surface; it fosters
flow in writing.
Character Thickness: Characters written in the Nastalique style vary in their
thickness as well. Within a single character, variations in stroke thickness
are clearly observable in Figure 2.11c. Such variation appears only when a
flat-tipped pen is used for writing; with a round-tipped pen it is absent.
Figure 2.10: Context dependency. (a) Varying shapes of an Urdu character: full-form, initial half-form (context of beginning), medial half-form (context of middle), and terminal half-form (context of final); (b) varying shapes of a second Urdu character in the same three contexts; (c) two different initial/medial half-forms of one character connected to different letters; (d) illustration of Assertion-1.
Stroke Analysis: Along the horizontal axis, strokes in Nastalique are generally
broad and sweeping, while along the vertical direction they are shorter. An
example is shown in Figure 2.11d.
Visual Impression: Nastalique conveys an impression of expeditious flow in
writing; characters seem to float or hang across the page, especially when the
text is arranged diagonally.
2.3 Idiosyncrasies of Urdu-Writing

Regardless of the font used or style adopted, Urdu writing exhibits behaviour that
is peculiar in several aspects, as given below:

Course of Writing/Direction of Text: In Urdu, the direction of writing (and of
reading) depends on whether the piece of text is alphabetic or alphanumeric.
Alphabetic text is written from right to left. In alphanumeric text, the
overall course likewise runs from right to left.
Nastalique:
Naskh:
������������������
دل و نگاه مسلماں نہیں تو کچھ بھی نہیں
(a) Baseline
���� ����������� ����� ������� ����� ���
Increase in tilting for Nastalique
(b) Atilt ligatures
Á Thick Part
Thin Part
Thick Part
(c) Thickness variation in an Urdu strokewritten in Nastalique font
K
Drawn horizontally
(from right to left)
�� »Drawn vertically
(from up to down)
(d) An Urdu ligature with horizontallybroad and vertically short character
Figure 2.11: Distinct features of Nastalique font
However, the numbers appearing in an alphanumeric text are penned from
left to right [46]. See Figure 2.12a.
Course of the Stroke: Mostly, the strokes of Urdu characters do not run along a
single direction [29]. For one character the stroke may start from the left,
come up, turn right, then come down to the left and curve towards the right;
for another it may start from the top right, come down, and go upward while
making a curve. See Figure 2.12b.
Ligatures Overlap: Two types of overlapping occur in Urdu ligatures [28], whether
the ligatures belong to the same word or to consecutive words of the same
sentence. The first is intra-ligature overlapping, in which characters of the
same ligature extend over each other so as to cover some portion of the
neighboring character. The second is inter-ligature overlapping, in which the
terminating character of the preceding ligature partly covers the beginning
character of the subsequent ligature, or vice versa. Inter-ligature overlap
also occurs between adjacent ligatures of two different words; this is
inter-word overlap. See Figure 2.12a. Overlapping is practised to obviate
unnecessary white-space; however, the characters do not make contact with each
other [29].
Position and Count of Diacritics: In the Urdu alphabet there are 5 different
diacritics (see subsection 2.1.1). The diacritics towey, hamza, and kash always
take a place above the major stroke, while the diacritic inverted hay always
occupies a place below it. The nuqta, or dot, may be found above, below, or
inside the major stroke, depending upon the alphabet. Besides position, the
count of the diacritics also varies from alphabet to alphabet. Among
multi-stroke characters, towey, hamza, and inverted hay occur only in
two-stroke alphabets. The diacritic kash forms one two-stroke and one
three-stroke alphabet; in the three-stroke case two kashes are used. The nuqta
count is 1 for two-stroke alphabets, 2 for three-stroke alphabets, and 3 for
four-stroke alphabets.
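The placement and count rules above can be collected into a small lookup, sketched below in Python. The transliterated diacritic names follow this thesis, but the data layout itself is an illustrative assumption, not a structure from the thesis.

```python
# Encoding of the diacritic placement/count rules described above.
# For each diacritic: where it may sit relative to the major stroke,
# and the total stroke counts of the alphabets it occurs in.
DIACRITIC_RULES = {
    "towey":        {"positions": {"above"}, "stroke_counts": {2}},
    "hamza":        {"positions": {"above"}, "stroke_counts": {2}},
    "kash":         {"positions": {"above"}, "stroke_counts": {2, 3}},  # two kashes in the 3-stroke case
    "inverted_hay": {"positions": {"below"}, "stroke_counts": {2}},
    "nuqta":        {"positions": {"above", "below", "inside"}, "stroke_counts": {2, 3, 4}},
}

def nuqta_count(total_strokes: int) -> int:
    """Dot count: 1, 2, or 3 for two-, three-, and four-stroke alphabets."""
    assert total_strokes in DIACRITIC_RULES["nuqta"]["stroke_counts"]
    return total_strokes - 1
```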
Figure 2.12: Idiosyncrasies of Urdu writing emphasizing ligature overlap, writing directions, and placement of diacritics. (a) Alternating course of writing, and ligature overlapping (intra-ligature, inter-ligature, and inter-word overlap); (b) stroke directions for Urdu characters and words; (c) Urdu diacritics: dots (nuqtas) above, below, or inside the major stroke, and diacritics other than dots above or below the major stroke; (d) above and below placement of nuqtas for Urdu words.
Knottiness of Dots or Nuqtas: As narrated previously, the minor-stroke dot or
nuqta can occupy a place above, below, or inside the major stroke. But
Figure 2.13: Idiosyncrasies of Urdu writing emphasizing the presence and characteristics of loops in different writing styles. (a) A handwritten character with a filled and an unfilled loop; (b) unfilled loops in a handwritten version; (c) filled and unfilled loops in a typographic version.

Figure 2.14: Idiosyncrasies of Urdu writing emphasizing the presence or absence of a loop in the same character penned by different hands. (a) The same initial half-form written with and without a loop; (b) occurrence of a false loop in a handwritten version.
will this placement be exactly above/below/inside the major stroke, or shifted to
its right or left, as shown in Figures 2.12c and 2.12d? The answer depends on
how far the ligature has gone diagonal and on how the writer chooses to place
the nuqtas. Moreover, in the case of multiple dots (two or three nuqtas), the
dots may or may not adjoin each other; it is again the writing hand's choice
how to draw them.
Small Loops: Some Urdu characters have small circular or oval loops in their
strokes. These are mostly hollow for handwritten characters, with some
exceptions, as shown in Figures 2.13a and 2.13b, but may or may not be filled
in typographic scripts; for an example see Figure 2.13c. Sometimes delusive
(false) loops occur in handwritten characters: such loops do not lie in the
standard stroke but arise from the hand writing the character. See Figure 2.14.
Chapter 3
System Description
The proposed online handwritten Urdu character recognition system begins with data
acquisition, passes through preprocessing algorithms in preparation for the
pre-classification and feature-extraction phases, and ends with the final
classification of characters. The block diagram of the whole system is shown in
Figure 3.1. Data acquisition and preprocessing are described in the following,
while the remaining stages are covered in upcoming chapters.
3.1 Data Acquisition
Handwriting data can be acquired using a pen-tablet device connected to a
computer, or by writing on the touch-sensitive screen of a smartphone. In our
study, 100 native Urdu writers of different age groups provided handwriting
samples using a stylus and digitizing tablet.
3.1.1 GUI: Writing Canvas
A Wacom tablet is used to collect handwritten samples for Urdu characters in
their half-forms. For this purpose, a graphical user interface (GUI) is developed
in the Visual C# programming language using wintab.dll. The interface connects to
the Wacom tablet and provides an on-screen writing canvas. The required half-form
(initial, medial, or terminal) is selected from a dropdown menu, and the character
is drawn on the writing canvas with the stylus. To visually aid the writer, upon
selection of the half-form the actual shape of the character and an exemplary
Figure 3.1: Block diagram of the proposed online Urdu character recognition system, from data acquisition through preprocessing and pre-classification to feature extraction and final classification. Writing on the tablet surface enters through the interface into the Urdu half-form database; preprocessing consists of down-sampling and smoothing; the pre-classifier splits the characters into subsets on the number of strokes, minor-stroke position, and minor-stroke shape. A subset of cardinality 1 needs no classifier, and its sole member is output directly; larger subsets go through feature extraction, and the pre-classifier's decision selects a classifier from the classifier bank, yielding the recognized half-form character.
word that uses the respective half-form character also appear on the right side of
the canvas. The canvas is depicted in Figure 3.2. Figure 3.3 shows an Urdu word
written on the canvas.
3.1.2 Information in Handwritten Character-Signal
Online handwritten character-signals contain the digitized coordinates
(x(t), y(t)) together with a pressure value and a time-stamp for each point
(x(t), y(t)).
Figure 3.2: Writing interface for the digitizing tablet, showing the control to connect to the tablet, the dropdown menus, the button that enables the writing area, the writing area itself, and panels displaying the shape and an example use of the selected character.
During data acquisition, the following attributes of the character strokes were
acquired:
1. Number of times the pen gets up or down.
2. Number of strokes used to draw a character.
3. Starting/ending index of each stroke.
4. Temporal order of each sample of (x(t), y(t)) coordinates.
Figure 3.3: An Urdu word written on the canvas with the help of a stylus and tablet
5. Pressure value at (x(t), y(t)). Note: pressure values are utilized in this work
only for detecting pen up/down events.
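A minimal sketch of how one acquired character-signal might be held in memory is given below; the field names and types are illustrative assumptions, not the binary layout actually used in this work.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class CharacterSignal:
    """One online character-signal (illustrative layout only)."""
    points: List[Tuple[float, float]]     # (x(t), y(t)) samples in temporal order
    pressure: List[float]                 # pressure value at each sample
    timestamps: List[float]               # time-stamp of each sample
    stroke_bounds: List[Tuple[int, int]]  # (start, end) index of each stroke

    @property
    def stroke_count(self) -> int:
        # each pen-down .. pen-up segment is one stroke
        return len(self.stroke_bounds)
```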
3.1.3 About the Data
The data obtained from the writers is in segmented form. Figure 2.8 (from Chapter
2) shows a few examples of full Urdu words and ligatures composed from the
segmented characters obtained from the participating writers, and demonstrates
that words composed from these segmented characters do resemble words written
continuously. To use a recognition system based on our proposed method in its
current form, the characters must be drawn in their segmented forms. If the
visual feel of a continuous word is required, the segmented characters should be
drawn at appropriate positions, as shown in Figure 2.8 (Chapter 2). We
are also working on the segmentation of characters from ligatures, which will be
reported in future work. A related work on the segmentation of handwritten Arabic
text can be found in [82], which presents an efficient skeleton-based grapheme
segmentation algorithm. With some modifications, this segmentation algorithm
together with our proposed methodology may serve as a full system for online Urdu
handwriting recognition. Segmentation of printed Urdu script is addressed in
[40, 49, 50].
3.1.4 Instructions for Writing

For a non-native audience, we present here some instructions that should be
followed while writing Urdu characters; native Urdu writers follow them
implicitly.
• There should be no pen-up event while drawing the major stroke i.e. the
major stroke should be drawn continuously without raising the pen,
• In case of multi-stroke characters, the major stroke should precede the minor
stroke(s), and
• Minor strokes should be penned one at a time, i.e. there must be pen-up
events between two or three dots, or between two ‘kashes’. In some cases
native writers violate this instruction, but for this work we stress
following it.
Two minor strokes drawn together (for example, two dots) can be separated using
the variation in pressure values.
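The pressure-based separation just described can be sketched as follows; the zero-pressure threshold `eps` is an illustrative assumption, not a value taken from the acquisition hardware.

```python
def split_strokes_by_pressure(pressure, eps=0.0):
    """Segment a sample sequence into strokes from its pressure values.

    A pen-up is assumed wherever pressure drops to `eps` or below; returns
    the (start, end) index pair of every pen-down run."""
    strokes, start = [], None
    for i, p in enumerate(pressure):
        if p > eps and start is None:
            start = i                       # pen touched down
        elif p <= eps and start is not None:
            strokes.append((start, i - 1))  # pen lifted
            start = None
    if start is not None:                   # stroke still open at end of signal
        strokes.append((start, len(pressure) - 1))
    return strokes
```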
3.2 Character Database
An Arabic handwritten-words database is available (Arabic DAtaBase: ADAB [83]),
but Urdu lacks a standard handwritten character database. Using the above GUI, a
large database of Urdu handwritten characters (in half-forms) has been
accumulated during this study. For this purpose, samples of Urdu characters were
handwritten by 100 different persons (all natives). The contributors, both males
and females, belong to different age groups and levels of literacy. Most of them
used a stylus and digitizing tablet for the first time. Each wrote 108 characters
on the tablet. This created a database of 108 × 100 = 10,800 Urdu characters in
their half-forms, where each character-signal is saved in a binary file format.
The database can be provided for research purposes.
(a) Example-1: An Urdu character in initial half-form is written onthe canvas with the help of a stylus and tablet
(b) Example-2: An Urdu character in initial half-form is written onthe canvas with the help of a stylus and tablet
Figure 3.4: Examples of handwritten character using stylus and digitizing tablet
(a) Example-3: An Urdu character in medial half-form is written onthe canvas with the help of a stylus and tablet
(b) Example-4: An Urdu character in terminal half-form is written onthe canvas with the help of a stylus and tablet
Figure 3.5: Examples of handwritten character using stylus and digitizing tablet
3.2.1 Handwritten Samples

This subsection presents different samples of handwritten characters, obtained
from native Urdu writers of different age groups. Figures 3.4 to 3.6 show a few
handwritten samples, and Figure 3.7 shows all Urdu characters in their half-forms
written by one user with the help of a stylus and digitizing tablet.

Figure 3.6: Examples of handwritten characters using a stylus and digitizing tablet
3.3 Preprocessing
The raw data obtained from the hardware contains artifacts such as jitter, hooks
at the start and end of strokes, and speed variations. To reduce the effect of
these artifacts, the following preprocessing steps have been performed on the raw
data.
3.3.1 Re-Sampling
Algorithm 1 has been implemented to remove repeated data-samples, i.e. identical
points that occur consecutively in temporal order. Due to the varying handwriting
speed of the writers, the acquired spatial sampling rate does not remain
constant. Since a tablet delivers data at a constant temporal rate,
Figure 3.7: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet
a large number of samples are generated in the regions where the writing speed is
slow, usually at the beginning of a stroke and around corners. These dense runs
of data-samples can produce several samples at the same (x, y) location and may
lead to erroneous values during feature extraction. To eliminate such multiple
samples, repeated data-samples are removed from the signal recorded for each
character. Afterwards a down-sampled version of this signal is obtained by
keeping every second data-sample, starting with the first. A few examples of
down-sampled data are shown in Figure 3.8.
3.3.2 Smoothing
Drawing on a tablet by inexperienced users, or roughness of the pen tip or
writing surface, may result in jitter and trembling in the writing [53]. To
mitigate these jittering effects, the character data is smoothed using a 5-point
moving-average filter given
by the following difference equation:

    y_s(i) = (1 / (2N + 1)) · (y(i+N) + y(i+N−1) + … + y(i−N))        (3.1)

where y_s(i) is the smoothed value of the i-th data point, N is the number of
neighboring data points on either side of y_s(i) (in this case N = 2), and
2N + 1 is the span. The results of the smoothing function are shown in
Figure 3.9.
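Equation (3.1) with N = 2 can be sketched as a simple Python routine; the shrinking-window treatment of the endpoints is an assumption, since the text does not specify boundary handling.

```python
def smooth(y, N=2):
    """Moving-average smoothing per Eq. (3.1): each output sample is the
    mean of the 2N+1 input samples centred on it (N = 2 gives the 5-point
    filter used here). Near the ends the window shrinks to the available
    samples."""
    ys = []
    for i in range(len(y)):
        lo, hi = max(0, i - N), min(len(y) - 1, i + N)
        window = y[lo:hi + 1]
        ys.append(sum(window) / len(window))
    return ys
```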
Algorithm 1 Removal of repeated data-points

 1: procedure RemoveRepeatedDataPoints(S)
    ⊲ S (M × 2) contains the X and Y coordinates of a given stroke
 2:   initialize k ← 1
 3:   initialize Sr(k) ← S(1)
 4:   for i = 2 to M do
 5:     if ||S(i−1) − S(i)||₂ = 0 then
 6:       Sr(k) ← S(i)          ⊲ repeated point: overwrite, do not advance k
 7:     else
 8:       k ← k + 1
 9:       Sr(k) ← S(i)
10:     end if
11:   end for
12:   return Sr
13: end procedure
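A Python rendering of Algorithm 1, together with the every-second-sample down-sampling step of subsection 3.3.1, might look like the following sketch.

```python
def remove_repeated_points(S):
    """Algorithm 1: drop samples whose (x, y) pair repeats the immediately
    preceding sample (Euclidean distance zero means an identical point)."""
    Sr = [S[0]]
    for p in S[1:]:
        if p != Sr[-1]:
            Sr.append(p)
    return Sr

def downsample(S):
    """Keep every second data-sample, starting with the first (Sec. 3.3.1)."""
    return S[::2]
```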
Figure 3.8: Re-sampling and down-sampling of characters. Panels (a)-(f) overlay the original data and the down-sampled data for six different Urdu characters.
Figure 3.9: Smoothing of Urdu handwritten samples. Panels (a)-(f) overlay the original data and the smoothed data for six different Urdu characters.
Chapter 4
Pre-Classification
Urdu characters can be grouped into subsets whose members share a similar major
stroke and differ from one another only in their minor strokes. Such similar
characters pose difficulty in classification. To reduce this difficulty, a novel
concept of pre-classification is presented here: the pre-classifier divides the
characters into small subgroups. The classification criteria are derived from the
properties of Urdu characters presented in Chapter 2. With online data
acquisition, information about a character becomes available as soon as it is
written on the tablet surface; this fact is exploited to pre-classify the Urdu
character-set.
4.1 Pre-Classification of Half-Forms
The pre-classification of initial half-forms is explained here; the
pre-classification of medial and terminal half-forms proceeds similarly.

Phase-I: In the first phase of pre-classification, the character-set is divided
into groups on the basis of the number of pen-up events, which represents the
number of strokes in a character. On the basis of stroke-count (SC), the
following four subsets are obtained (see Figure 4.1):
• The subset of single-stroke characters. It contains 7 characters
• The subset of two-stroke characters. It contains 17 characters
Figure 4.1: Pre-classification of initial half-forms on the basis of stroke count, position and shape of diacritics
• The subset of three-stroke characters. It contains 6 characters
• The subset of four-stroke characters. It contains 6 characters
Phase-II: In the second phase, every multi-stroke subset obtained in Phase-I is
further segregated into two sub-subsets on the basis of the position of the
diacritic(s). In initial and medial half-forms, the diacritic(s) of a
multi-stroke Urdu character lie either above or below the major stroke. For
terminal half-forms the diacritic(s) may be above, below, or inside the major
stroke; for the sake of simplicity, in this study a diacritic placed inside
the major stroke of a terminal half-form is treated as a diacritic placed
above the major stroke. At the end of Phase-II, we obtain 6 sub-subsets of
multi-stroke initial half-form characters.
Phase-III: In the third phase, the shape of the diacritics drives further
segregation of the sub-subsets obtained in Phase-II. The diacritics of Urdu
characters can be divided into two genres on the basis of their shapes:
Figure 4.2: Pre-classification of medial half-forms on the basis of stroke count, position and shape of diacritics
• the nuqta, or dot, diacritic, and

• the other-than-dot diacritics: towey, inverted hay, hamza, and kash.
The simplest way to differentiate between the dot diacritic and the
other-than-dot diacritics is the length of the shape: a dot is inherently
shorter than any other-than-dot diacritic. This fact enables the further
classification. The sub-subsets obtained in the second phase are divided
further, where possible, to produce 10 sub-sub-subsets of characters in
initial half-forms. These final sub-sub-subsets are the terminal leaves of
the pre-classification trees shown for initial, medial, and terminal
half-forms in Figures 4.1, 4.2, and 4.3 respectively.
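The three phases can be sketched as a single decision routine; the `dot_length_threshold` parameter is an illustrative assumption standing in for the length comparison described in Phase-III.

```python
def pre_classify(stroke_count, diacritic_position=None, diacritic_length=None,
                 dot_length_threshold=1.0):
    """Three-phase pre-classification of one half-form character.

    Phase I: stroke count; Phase II: diacritic position relative to the
    major stroke ('above' or 'below'); Phase III: dot vs. other-than-dot,
    decided here by comparing the diacritic length against an illustrative
    threshold (the thesis only states that dots are inherently shorter)."""
    if stroke_count == 1:
        return ("single-stroke",)  # Phase I alone settles this subset
    if diacritic_length is not None and diacritic_length < dot_length_threshold:
        genre = "dot"
    else:
        genre = "other-than-dot"
    return (f"{stroke_count}-stroke", diacritic_position, genre)
```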
4.2 Results of Pre-Classification
As a result of pre-classification, we finally obtain 10, 10, and 8 subsets for
initial, medial, and terminal half-form characters respectively. These are:
Figure 4.3: Pre-classification of terminal half-forms on the basis of stroke count, position and shape of diacritics. The tree notes that no three-stroke terminal character has its secondary stroke below the primary stroke.
Initial Half-forms: All initial half-forms are pre-classified as follows:
1. Single-stroke characters (7 characters)
2. Two-stroke characters with dot diacritic above the major stroke (6 char-
acters)
3. Two-stroke characters with other-than-dot diacritic above the major
stroke (6 characters)
4. Two-stroke characters with dot diacritic below the major stroke (3 char-
acters)
5. Two-stroke characters with other-than-dot diacritic below the major
stroke (2 characters)
6. Three-stroke characters with dot diacritic above the major stroke (3
characters)
7. Three-stroke characters with other-than-dot diacritic above the major
stroke (2 characters)
Table 4.1: Pre-classification of the Urdu character-set. The counts in parentheses give the cardinality of the final-stage subsets obtained with the proposed pre-classifier.

Initial half-forms (36 characters):
  Single-stroke: 7 characters
  Two-stroke: 17 characters; above: 12 (dot: 6, other-than-dot: 6); below: 5 (dot: 3, other-than-dot: 2)
  Three-stroke: 6 characters; above: 5 (dot: 3, other-than-dot: 2); below: 1
  Four-stroke: 6 characters; above: 3 (dot: 3); below: 3 (dot: 3)

Medial half-forms (30 characters):
  Single-stroke: 8 characters
  Two-stroke: 13 characters; above: 10 (dot: 6, other-than-dot: 4); below: 3 (dot: 2, other-than-dot: 1)
  Three-stroke: 5 characters; above: 4 (dot: 2, other-than-dot: 2); below: 1
  Four-stroke: 4 characters; above: 2 (dot: 2); below: 2 (dot: 2)

Terminal half-forms (42 characters):
  Single-stroke: 16 characters
  Two-stroke: 17 characters; above: 16 (dot: 9, other-than-dot: 7); below: 1
  Three-stroke: 4 characters; above: 4 (dot: 3, other-than-dot: 1)
  Four-stroke: 5 characters; above: 4 (dot: 4); below: 1
8. Three-stroke characters with dot diacritic below the major stroke (1
character)
9. Four-stroke characters with dot diacritic above the major stroke (3
characters)
10. Four-stroke characters with dot diacritic below the major stroke (3
characters)
Medial Half-forms: All medial half-forms are pre-classified as follows:
1. Single-stroke characters (8 characters)
Table 4.2: Characters recognized at the pre-classification stage, requiring no further classification

Half-Form   Subset/sub-subset                  Character   Recognition Rate
Initial     3-stroke, dot below                Í           100%
Medial      2-stroke, other-than-dot below     È           100%
Medial      3-stroke, dot below                Î           100%
Terminal    2-stroke, dot below                K           100%
Terminal    3-stroke, other-than-dot above     {           100%
Terminal    4-stroke, dot below                „           100%
2. Two-stroke characters with dot diacritic above the major stroke (6 char-
acters)
3. Two-stroke characters with other-than-dot diacritic above the major
stroke (4 characters)
4. Two-stroke characters with dot diacritic below the major stroke (2 char-
acters)
5. Two-stroke characters with other-than-dot diacritic below the major
stroke (1 character)
6. Three-stroke characters with dot diacritic above the major stroke (2
characters)
7. Three-stroke characters with other-than-dot diacritic above the major
stroke (2 characters)
8. Three-stroke characters with dot diacritic below the major stroke (1
character)
9. Four-stroke characters with dot diacritic above the major stroke (2
characters)
10. Four-stroke characters with dot diacritic below the major stroke (2
characters)
Terminal Half-forms: All terminal half-forms are pre-classified as follows:
1. Single-stroke characters (16 characters)
2. Two-stroke characters with dot diacritic above the major stroke (9 char-
acters)
3. Two-stroke characters with other-than-dot diacritic above the major
stroke (7 characters)
4. Two-stroke characters with dot diacritic below the major stroke (1 character)
5. Three-stroke characters with dot diacritic above the major stroke (3
characters)
6. Three-stroke characters with other-than-dot diacritic above the major
stroke (1 character)
7. Four-stroke characters with dot diacritic above the major stroke (4
characters)
8. Four-stroke characters with dot diacritic below the major stroke (1 character)
Figures 4.1 to 4.3 give a pictorial description of the pre-classification of
initial, medial, and terminal half-form characters. Table 4.1 records the
pre-classification results in tabular form.
4.3 Further Reflections on Pre-Classification

The classification of Urdu characters into subsets and sub-subsets through
pre-classification reflects the following:

For 3-stroke characters, no case occurs in which a character has an
other-than-dot diacritic below the major stroke.
For 4-stroke characters, only characters with a dot diacritic, either above
or below the major stroke, exist; no other-than-dot diacritic is present in
this case.
Uncontested characters are filtered out by pre-classification. An uncontested
character is one which stands alone in its respective subset or sub-subset.
For example, the character ‘Í’, in the three-stroke group, stands alone in its
subset and faces no competition requiring any further classification. The
uncontested characters from all half-forms are given in Table 4.2; they are
fully recognized at the pre-classification stage and require no further
recognition.
With the small subsets produced by the pre-classifier, it becomes possible to design banks of simple artificial neural networks (ANNs), support vector classifiers (SVCs), deep belief networks, or recurrent neural networks for fine classification within the subsets.
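Such a bank can be organized as a simple dispatch from the pre-classification attributes (stroke count, diacritic type, diacritic position) to a dedicated fine classifier, with uncontested subsets resolved immediately. The following Python sketch is purely illustrative; the class, the subset keys, and the character names are our placeholders, not the thesis code:

```python
class SubsetBank:
    """One fine classifier per pre-classified subset; singleton subsets
    (uncontested characters) are recognized without a fine stage."""

    def __init__(self):
        self.classifiers = {}   # subset key -> fine classifier
        self.uncontested = {}   # subset key -> its single character

    def register(self, key, members, classifier=None):
        if len(members) == 1:
            self.uncontested[key] = members[0]
        else:
            self.classifiers[key] = classifier

    def recognize(self, key, features):
        if key in self.uncontested:
            return self.uncontested[key]          # no further recognition needed
        return self.classifiers[key].predict(features)

class DummyClassifier:
    """Stand-in for an ANN/SVM trained on one subset."""
    def predict(self, features):
        return "sheen"

bank = SubsetBank()
# Keys are (stroke count, diacritic type, diacritic position); members are placeholders.
bank.register((3, "dot", "above"), ["seh", "sheen", "khay"], DummyClassifier())
bank.register((3, "other", "above"), ["lone-char"])           # uncontested subset
print(bank.recognize((3, "dot", "above"), features=None))     # sheen
print(bank.recognize((3, "other", "above"), features=None))   # lone-char
```

In this layout, the fine classifiers never see each other's subsets, which is what keeps each individual network or SVM small.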
Chapter 5
Features Extraction
Selecting appropriate features for a recognition task is necessary for achieving high performance [84]. Computing suitable features, in every online system, helps reduce the computational complexity of a pattern recognition problem [85]. However, the selection and extraction of such features do not follow any specific technique. The variations involved in any one kind of problem mean that a feature set designed for a particular problem may not be satisfactory for a similar problem. One can deduce that no widely accepted feature set currently exists that works successfully across even one whole class of problems [86]. To reduce computational complexity, prominent features are extracted from the preprocessed data. However, the optimum size of the feature vector for recognizing a handwritten character depends on the complexity involved.
For Arabic/Urdu handwritten character recognition, different types of features have been presented in the literature, namely structural features, statistical features, and global transformation features. With structural features [30, 36, 37], a model or standard template is designed for each class of letters that contains all the significant information against which test samples are compared. The statistical approach uses information about the underlying statistical distribution of some measurable events or phenomena of interest in the input data [32, 33]. The character recognition problem has also been addressed in transformed domains using the Fourier transform, discrete cosine transform, wavelet transform, Gabor transform, and Walsh-Hadamard transform, etc. [29, 87].
[Figure 5.1 (plots omitted): Wavelet coefficients for ‘sheen’ and ‘zwad’. (a) Character ‘sheen’ in medial form, with x(t) and y(t) of its major stroke in the top row; the second and third rows show the level-2 db2 wavelet approximation and level-4 db2 wavelet detail coefficients of x(t) and y(t), respectively. (b) The same layout for character ‘zwad’ in medial form.]
In this research work, the problem of recognizing online handwritten Urdu characters in their half-forms has been addressed using three types of features separately. These are:
1. Wavelet Transform
2. Structural Features
3. Sensory Input Values
5.1 Wavelet Transform
To discriminate characters from each other, a human reader looks for the exact locations of smooth regions, sharp turns, and cusps as the landmarks of interest. With the structural, statistical, and global transformation features used in [29, 30, 32, 33, 36, 37], it is not possible to locate these landmarks exactly. In the proposed study, wavelet transformation of the handwritten stroke data enables us to pinpoint these landmarks accurately and leads to better recognition rates.
Wavelet transformation is applied to a signal or image to determine and analyze its localized features, i.e., to obtain a time-scale representation of that signal or image. It is a multi-resolution technique that splits data into different frequency components and then analyzes each component with a resolution matched to its scale [88]. The wavelet series expansion of a function f(x) is given in Equation 5.1.
f(x) = \sum_{k} c_{j_0}(k)\,\varphi_{j_0,k}(x) + \sum_{j=j_0}^{\infty} \sum_{k} d_j(k)\,\Psi_{j,k}(x) \qquad (5.1)
where c_{j_0}(k) are the approximation (or scaling) coefficients and d_j(k) are the detail (or wavelet) coefficients [89]. Details about wavelets can be studied in [88]; a brief review of wavelet properties is given in [90].
[Figure 5.2 (plots omitted): Wavelet coefficients for different Urdu characters in their half-forms: (a) ‘alif’ in terminal form, (b) ‘bari ye’ in terminal form, (c) ‘swad’ in initial form, and (d) ‘two-eyed-hay’ in terminal form, each with its wavelet transform. In each panel the top row shows the character and the x(t) and y(t) of its major stroke; the second and third rows show the level-2 db2 wavelet approximation and level-4 db2 wavelet detail coefficients of x(t) and y(t), respectively.]
In character recognition problems, the wavelet transform has been used for languages such as English, Chinese, Arabic, Persian, and various Indian languages [57, 59, 61, 85, 91–93]. To verify the discriminating potential of wavelet features for handwritten Urdu characters in half-forms, a multilevel one-dimensional wavelet analysis is applied to the preprocessed data. For this purpose, different wavelet families are used, and approximation and detail coefficients are obtained for the x(t) and y(t) coordinates of the handwritten strokes.
[Figure 5.3 (plots omitted): Wavelet coefficients for ‘Tay’. (a) Character ‘Tay’ in medial form, with x(t) and y(t) of its major stroke, their level-2 db2 wavelet approximation, and their level-4 db2 wavelet detail coefficients. (b) The minor stroke ‘towey’ with its x(t) and y(t) coordinates, their level-2 db2 wavelet approximation, and their level-2 db2 wavelet detail coefficients.]
5.1.1 Daubechies Wavelets
Daubechies wavelets [94] are helpful in solving problems where the self-similarity properties of a signal are prominent. Moreover, the Daubechies family is also useful for dealing with signal discontinuities. Both of these properties, self-similarity and discontinuities, are inherent features of Urdu characters. Therefore, in order to obtain better classification accuracy, the Daubechies db2 wavelet is applied to the Urdu character signals under consideration. Both approximation and detail coefficients are used in the feature vector. In fact, keeping the feature vector as small as possible, it was found after some trials that the level-2 approximation coefficients and level-4 detail coefficients provided the best classification accuracy. The feature vector is
W = \left[\, \overrightarrow{cA2}_x \;\; \overrightarrow{cD4}_x \;\; \overrightarrow{cA2}_y \;\; \overrightarrow{cD4}_y \,\right]^{T} \in \mathbb{R}^{n} \qquad (5.2)

where \overrightarrow{cA2}_x and \overrightarrow{cA2}_y are the vectors of level-2 approximation coefficients, and \overrightarrow{cD4}_x and \overrightarrow{cD4}_y are the vectors of level-4 detail coefficients of the one-dimensional x(t) and y(t) signals of the stroke coordinates (x(t), y(t)). C++ or MATLAB code may be used to obtain the wavelet coefficients; in this study MATLAB is used for the wavelet transformations.
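The construction of W can be sketched in code. The snippet below is an illustrative Python/NumPy version that uses a hand-rolled Haar analysis step in place of MATLAB's db2 (the filters differ, but the approximation/detail structure of Equation 5.2 is the same); all function names are ours, not the thesis code:

```python
import numpy as np

def haar_step(s):
    """One analysis level: split a signal into approximation and detail."""
    s = s[: len(s) // 2 * 2]                      # drop a trailing odd sample
    approx = (s[0::2] + s[1::2]) / np.sqrt(2)
    detail = (s[0::2] - s[1::2]) / np.sqrt(2)
    return approx, detail

def wavedec(s, level):
    """Multilevel decomposition: approximation at `level`, details per level."""
    details = []
    for _ in range(level):
        s, d = haar_step(s)
        details.append(d)
    return s, details

def feature_vector(x, y):
    """W = [cA2x  cD4x  cA2y  cD4y]^T as in Equation 5.2."""
    cA2x, _ = wavedec(x, 2)
    _, dx = wavedec(x, 4)
    cA2y, _ = wavedec(y, 2)
    _, dy = wavedec(y, 4)
    return np.concatenate([cA2x, dx[3], cA2y, dy[3]])   # dx[3] is the level-4 detail

t = np.linspace(0, 1, 64)
x, y = np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)     # stand-in stroke signals
W = feature_vector(x, y)
print(W.shape)   # 16 + 4 + 16 + 4 coefficients -> (40,)
```

With the real db2 filters (e.g. MATLAB's `wavedec`), coefficient lengths differ slightly because of boundary extension, but the feature vector is assembled the same way.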
5.1.2 Discrimination Power of Wavelets
Figures 5.1 and 5.2 show different handwritten characters in half-forms along with their wavelet profiles. In each figure, the top row shows the handwritten stroke and the x(t) and y(t) signals of the major stroke; the second row shows the \overrightarrow{cA2}_x and \overrightarrow{cA2}_y coefficients, while the third row shows the \overrightarrow{cD4}_x and \overrightarrow{cD4}_y coefficients. From the figures it can easily be observed that the wavelet coefficients of different characters are quite different from each other. Such dissimilarity indicates that wavelet features offer good discrimination power. The results verify that using wavelet features in the way presented above provides high recognition rates.
[Figure 5.4 (plots omitted): Wavelet coefficients for ‘kaafI’. (a) Character ‘kaafI’ in medial form, with x(t) and y(t) of its major stroke, their level-2 db2 wavelet approximation, and their level-4 db2 wavelet detail coefficients. (b) The minor stroke ‘kash’ with its x(t) and y(t) coordinates, their level-2 db2 wavelet approximation, and their level-2 db2 wavelet detail coefficients.]
Figures 5.3 to 5.5 represent the case where an other-than-dot minor stroke is involved. Here, characters have similar major strokes and are distinguishable from each other only on the basis of the shapes of their minor strokes. Since the minor stroke in this case is significantly long, its wavelet coefficients are included along with those of the major stroke to form the feature vector.
Sometimes, because of the variability of hand movements and the flow of writing, the minor difference(s) between shapes disappear. Therefore, one shape might look like another. Such scenarios give rise to confusion among different characters. For example, the character ‘ghain’ appears similar to the character ‘fay’, both in medial half-form, as shown in Figure 5.6. Another example is shown in Figure 5.7, in which the x(t) and y(t) coordinates look alike for the two different characters ‘daal’ and ‘wao’ in terminal form. The implications of these similarities are further explained by handwritten samples of the said characters in Chapter 6. In another example, the character ‘meem’ in Figure 5.8b is shaped by the user more like the character ‘hay’ in initial form, shown in Figure 5.9b.
5.1.3 Biorthogonal Wavelets
Biorthogonal wavelets provide symmetric extensions for finite-length signals. Using bior1.3 in MATLAB, the level-2 approximation coefficients and level-4 detail coefficients of the Urdu character signals are employed as feature vectors.
5.1.4 Discrete Meyer Wavelets
Meyer wavelets are normally used for continuous analysis; however, the discrete approximation of Meyer wavelets (dmey) is used as a feature vector for Urdu characters in this study. Again, it was found after some trials that the level-2 approximation coefficients and level-4 detail coefficients provided the best classification accuracy.
[Figure 5.5 (plots omitted): Wavelet coefficients for ‘hamza’. (a) Character ‘hamza’ in medial form, with x(t) and y(t) of its major stroke, their level-2 db2 wavelet approximation, and their level-4 db2 wavelet detail coefficients. (b) The minor stroke of ‘hamza’ with its x(t) and y(t) coordinates, their level-2 db2 wavelet approximation, and their level-2 db2 wavelet detail coefficients.]
5.2 Structural Features
In this study, for comparison purposes, in addition to the wavelet-based features, the structural features proposed by Khan and Haider [32, 33] have also been employed and tested. The feature vector includes the following structural aspects of the character:
• Major stroke length
• Initial x and y trend (major stroke)
• Final x and y trend (major stroke)
• Major-to-minor stroke ratio
• Half-major-stroke box-slope
• Cusp in major stroke
• Cusp in minor stroke
• Pre-cusp x and y trend (major stroke)
• Terminating half-plane
• Intersection of major stroke trajectory with centroid axes
• Is the starting ordinate the highest ordinate of the major stroke?   (5.3)
It is shown in the results (Chapter 6) that with wavelet features the recognition
accuracy is far better than that obtained with structural features.
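A few of the entries in (5.3) are straightforward to compute from the raw point sequence. The sketch below illustrates three of them (stroke length, initial trend, and the major-to-minor stroke ratio) in Python; it is our interpretation of the feature names, not the implementation of [32, 33]:

```python
import math

def stroke_length(points):
    """Polyline length of a stroke given as a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def initial_trend(points, k=5):
    """Sign of the net x and y movement over the first k samples."""
    (x0, y0), (xk, yk) = points[0], points[min(k, len(points) - 1)]
    sign = lambda v: (v > 0) - (v < 0)
    return sign(xk - x0), sign(yk - y0)

def major_to_minor_ratio(major, minor):
    """Length ratio used to separate long minor strokes from dots."""
    return stroke_length(major) / stroke_length(minor)

major = [(0, 0), (3, 4), (6, 8)]     # toy major stroke (two 3-4-5 segments)
minor = [(0, 0), (0.6, 0.8)]         # toy minor stroke
print(stroke_length(major))          # 10.0
print(initial_trend(major, k=1))     # (1, 1)
print(major_to_minor_ratio(major, minor))
```

Features such as cusp detection and the box-slope would need curvature estimates and bounding-box geometry on top of this.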
5.3 Sensory Input Values
For a given character, the sensory input values are the x(t) and y(t) coordinates recorded through a pen-tablet device. These are the raw data values, passed only through the preprocessing phase and then fed to the classifier as a one-dimensional array.
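As a minimal sketch, assume the preprocessed stroke is a resampled list of (x, y) points and that the classifier input concatenates all x(t) values followed by all y(t) values (the exact layout is our assumption, not stated in the thesis):

```python
def to_sensory_vector(points):
    """Flatten resampled pen coordinates [(x, y), ...] into one 1-D array:
    all x(t) values first, then all y(t) values (layout assumed)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs + ys

stroke = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7)]
print(to_sensory_vector(stroke))   # [0.1, 0.2, 0.3, 0.9, 0.8, 0.7]
```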
[Figure 5.6 (plots omitted): Wavelet coefficients for ‘ghain’ and ‘fay’. (a) Character ‘ghain’ in medial form and (b) character ‘fay’ in medial form, each shown with x(t) and y(t) of its major stroke in the top row; the second and third rows show the level-2 db2 wavelet approximation and level-4 db2 wavelet detail coefficients of x(t) and y(t), respectively.]
[Figure 5.7 (plots omitted): Wavelet coefficients for ‘daal’ and ‘wao’, both in terminal form, each shown with x(t) and y(t) of its major stroke, their level-2 db2 wavelet approximation, and their level-4 db2 wavelet detail coefficients. Due to the users' flow of writing, ‘daal’ has acquired a loop, which makes its wavelet transform similar to that of ‘wao’.]
[Figure 5.8 (plots omitted): Wavelet coefficients for three different handwritten samples (versions 1–3) of ‘meem’ in initial form, written differently by different users. In each panel the top row shows the character and the x(t) and y(t) of its major stroke; the second and third rows show the level-2 db2 wavelet approximation and level-4 db2 wavelet detail coefficients of x(t) and y(t), respectively.]
[Figure 5.9 (plots omitted): Wavelet coefficients for the Urdu character ‘hay’ in its half-forms: (a) ‘hay’ in initial form, (b) the same character in initial form written a bit more like ‘meem’ in Figure 5.8b, (c) ‘hay’ in medial form, and (d) ‘hay’ in terminal form, each with its wavelet transform. In each panel the top row shows the character and the x(t) and y(t) of its major stroke; the second and third rows show the level-2 db2 wavelet approximation and level-4 db2 wavelet detail coefficients of x(t) and y(t), respectively.]
The reason for employing different types of features is to find a better recognition solution for handwritten Urdu characters. How helpful each kind of feature proved to be is discussed in Chapter 6.
Chapter 6
Final Classification
For ‘fine and final’ classification, different classifiers, from traditional artificial neural networks to deep learning machines, are employed. Support vector machines (SVMs) have also been exercised with different wavelet families. Final classification is carried out in two ways: with the proposed pre-classification, and without it. The route of final classification through pre-classification proved far better at reaching fine classification than the alternative. Below are the classification techniques used in this research work.
6.1 Final Classifiers with Pre-Classification
Classifiers along with their input feature-types are listed below.
• Artificial neural networks with Daubechies (db2) wavelets and with structural features.
• Support vector machines with Daubechies (db2), Biorthogonal (bior1.3), and Discrete Meyer (dmey) wavelets and with sensory input, separately for each of the mentioned input feature types.
• Deep belief networks with Discrete Meyer (dmey) wavelets.
• Deep belief networks using AutoEncoders with Discrete Meyer (dmey)
wavelet features and with sensory input values.
Table 6.1: Features-classifier summary

With pre-classification:
    Daubechies wavelets        ANN
    Structural features        ANN
    Daubechies wavelets        SVM
    Biorthogonal wavelets      SVM
    Discrete Meyer wavelets    SVM
    Sensory input values       SVM
    Discrete Meyer wavelets    DBN
    Sensory input values       AutoEncoders-DBN
    Discrete Meyer wavelets    AutoEncoders-DBN
    Discrete Meyer wavelets    AutoEncoders-SVM
    Sensory input values       AutoEncoders-SVM
    Sensory input values       RNN

Without pre-classification:
    Discrete Meyer wavelets    SVM
    Sensory input values       SVM
    Discrete Meyer wavelets    DBN
    Sensory input values       DBN
    Discrete Meyer wavelets    AutoEncoders-DBN
    Sensory input values       AutoEncoders-DBN
    Sensory input values       AutoEncoders-SVM
    Sensory input values       RNN
• AutoEncoders-SVM classifier with sensory input values and with Discrete
Meyer (dmey) wavelet features.
• Recurrent neural networks using sensory input values.
For fine classification of each character within the subsets produced by the pre-classifier, a dedicated classifier is designed for each subset. In this work, the responses of artificial neural network (ANN) and support vector machine (SVM) classifiers, along with the different input features described in Chapter 5, have been studied. Moreover, recurrent neural networks (RNNs) and deep belief networks (DBNs) have also been applied to compare against the responses obtained through the ANN and SVM.
6.2 Final Classifiers without Pre-Classification
Classifiers used without pre-classification, along with their input feature types, are listed below.
• Support vector machines with Discrete Meyer (dmey) wavelets and with
sensory input.
• Deep belief networks with Discrete Meyer (dmey) wavelets and with sensory
input values.
• Deep belief networks using AutoEncoders with Discrete Meyer (dmey)
wavelet features and also with sensory input values.
• AutoEncoders-SVM classifier with sensory input values.
• Recurrent neural networks using sensory input values.
The classification results show that pre-classification of the Urdu character set plays a vital role in achieving higher recognition accuracy. Table 6.1 presents a summary of the classification techniques used in this study, along with the features used by each particular classifier.
6.3 Artificial Neural Networks
For pattern recognition problems, developing a multi-layer perceptron (MLP) neural network with the backpropagation algorithm is a very popular approach [95–99]. The ANNs used in this work are single- or multi-layer backpropagation neural networks (BPNNs). A sample structure of a multi-layer perceptron neural network is shown in Figure 6.1. For each of the 22 subsets (cardinality ≥ 2), an ANN is configured, trained, and tested. In this way a bank of ANNs is obtained in which each neural network serves to recognize a specific character subset. There are two different banks of ANNs:
1. ANNs trained using structural features. All of these ANNs consist of no more than 3 layers, with a small number of neurons in each layer.

2. ANNs trained using wavelet db2 approximation and detail coefficients. Table 6.2 presents the configurations of these ANNs.
In the MATLAB environment, from 10800 Urdu half-form samples, all ANNs are trained on 40% of the data-set (40 instances of each character) and tested on the remaining 60% (6480 samples).
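The BPNN training used here can be illustrated with a minimal batch backpropagation loop. The NumPy sketch below trains a one-hidden-layer sigmoid MLP on a toy two-class problem standing in for one character subset; the layer sizes, learning rate, and data are placeholders, not the thesis configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy two-class problem: label a point by the sign of x0 + x1.
X = rng.uniform(-1, 1, size=(200, 2))
t = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

# One hidden layer; plain batch backpropagation with a logistic loss.
W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.5
for _ in range(2000):
    h = sigmoid(X @ W1 + b1)           # forward pass, hidden layer
    y = sigmoid(h @ W2 + b2)           # forward pass, output layer
    g2 = (y - t) / len(X)              # output delta (sigmoid + cross-entropy)
    g1 = (g2 @ W2.T) * h * (1 - h)     # delta backpropagated to the hidden layer
    W2 -= lr * (h.T @ g2); b2 -= lr * g2.sum(axis=0)
    W1 -= lr * (X.T @ g1); b1 -= lr * g1.sum(axis=0)

accuracy = float(np.mean((y > 0.5) == (t > 0.5)))
print(f"training accuracy: {accuracy:.2f}")
```

A bank of such networks, one per subset, with the layer sizes of Table 6.2 and wavelet feature vectors as inputs, mirrors the setup described above.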
[Figure 6.1 (diagram omitted): Multi-layer perceptron neural network with inputs x1 … xp, two hidden layers, and output y.]
6.4 Support Vector Machines (SVMs)
SVMs are also widely used for pattern classification and recognition [99, 100]. The speciality of the SVM is that minimization of the empirical classification error and maximization of the geometric margin occur simultaneously. Using SVMs with pre-classification, six separate banks are trained to classify the data-set. Three more banks of SVMs without any pre-classification are also trained and tested on the data. See Table 6.1 for details.
SVMs are set up using LIBSVM (MATLAB) [101]. LIBSVM offers a choice of different kernel functions (e.g., linear, polynomial, radial basis function (RBF), sigmoid) with various parameters for these kernels. For the proposed
Table 6.2: ANN configurations (trained using wavelet db2 approximation and detail coefficients).

Target Group | No. of hidden layers | Neurons in hidden layers 1 / 2 / 3
ANN Configuration: Initial Half Forms
Single-stroke 3 9 9 5
2-stroke dot Above 2 9 6 -
2-stroke other- Above 2 9 6 -
2-stroke dot Below 1 1 - -
2-stroke other- Below 1 1 - -
3-stroke dot Above 2 2 3 -
3-stroke other- Above 2 2 3 -
4-stroke dot Above 2 6 3 -
4-stroke dot Below 2 4 3 -
ANN Configuration: Medial Half Forms
Single-stroke 3 8 6 8
2-stroke dot Above 2 9 9 -
2-stroke other- Above 2 8 6 -
2-stroke dot Below 1 1 0 -
3-stroke dot Above 2 3 3 -
3-stroke other- Above 1 2 - -
4-stroke dot Above 2 4 2 -
4-stroke dot Below 2 4 2 -
ANN Configuration: Terminal Half Forms
Single-stroke 3 8 8 16
2-stroke dot Above 2 7 9 -
2-stroke other- Above 2 7 7 -
3-stroke dot Above 1 2 - -
4-stroke dot Above 2 4 2 -
study, C-SVM (multi-class classification) with a radial basis function kernel is employed. For the selection of good parameters, the training set is used with 5-fold cross-validation, and optimized values of the cost of constraint violation C and of the RBF parameter γ are obtained. All SVM banks are then trained with a randomly selected 40% of the sample data and tested on the remaining 60%.
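The parameter search just described can be sketched as a plain grid search over (C, γ) with k-fold cross-validation. In the sketch below, `evaluate` is a stand-in for a LIBSVM train-and-score call on one fold; the dummy evaluator and the grid values are our placeholders:

```python
from itertools import product
import random

def k_fold_indices(n, k=5, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def grid_search(n_samples, evaluate, Cs, gammas, k=5):
    """Pick the (C, gamma) pair with the best mean k-fold validation score.
    `evaluate(train_idx, val_idx, C, gamma)` stands in for training an SVM
    on the training fold indices and scoring it on the validation fold."""
    folds = k_fold_indices(n_samples, k)
    best_params, best_score = None, float("-inf")
    for C, gamma in product(Cs, gammas):
        scores = []
        for i in range(k):
            val = folds[i]
            train = [j for f in folds[:i] + folds[i + 1:] for j in f]
            scores.append(evaluate(train, val, C, gamma))
        mean = sum(scores) / k
        if mean > best_score:
            best_params, best_score = (C, gamma), mean
    return best_params, best_score

# Dummy evaluator that happens to peak at C=8, gamma=0.125.
dummy = lambda tr, va, C, g: -abs(C - 8) - abs(g - 0.125)
params, score = grid_search(100, dummy, Cs=[2, 8, 32], gammas=[0.125, 0.5])
print(params)   # (8, 0.125)
```

In practice, LIBSVM's own cross-validation option performs the inner loop; this sketch only shows the selection logic.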
6.5 Recurrent Neural Networks: Long Short-Term Memory
Recurrent neural networks (RNNs) (Figure 6.2) introduce a notion of time into traditional feedforward artificial neural networks, enabling the network to make use of the temporal patterns present in sequential data. In a sequential data set, the current output depends on previously computed values. RNNs achieve this through the inclusion of edges that span adjacent time steps. For sequence learning, long short-term memory (LSTM) and bidirectional recurrent neural networks (BRNNs) are considered the most successful RNN architectures. In LSTM RNNs, the traditional nodes in the hidden layer of the network are replaced by memory units. The bidirectional architecture utilizes information from both the past and the future to compute the output at any point in the sequence [102]. This has made recurrent neural networks more readily applicable to cursively handwritten scripts.
[Figure 6.2 (diagram omitted): Bidirectional multi-layer recurrent neural network with inputs z(t-2) … z(t+1), stacked hidden layers, and outputs o(t-2) … o(t+1).]
[Figure 6.3 (diagram omitted): A simple recurrent neural network with input, hidden, and output layers. Along solid edges, activation is passed as in a feed-forward network; along dashed edges, a source node at each time t is connected to a target node at the following time t+1.]
For the simple recurrent neural network shown in Figure 6.3, Equations 6.1 and 6.2 express all the calculations necessary at each time step of the forward pass [102]:

h^{(t)} = \sigma\left(W^{hX} x^{(t)} + W^{hh} h^{(t-1)} + b_h\right) \qquad (6.1)

\hat{y}^{(t)} = \mathrm{softmax}\left(W^{yh} h^{(t)} + b_y\right) \qquad (6.2)

where W^{hX} is the conventional weight matrix between the input layer and the hidden layer, W^{hh} is the recurrent weight matrix between the hidden layer and itself at adjacent time steps, and b_h and b_y are bias terms that allow each node to learn an offset.
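Equations 6.1 and 6.2 translate directly into a forward pass over the sequence. The NumPy sketch below uses the logistic function for σ; the weight shapes and the toy input are our placeholders, not an LSTM or the RNNLIB configuration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_forward(xs, WhX, Whh, Wyh, bh, by):
    """Forward pass of the simple RNN of Equations 6.1 and 6.2:
    h(t) = sigma(WhX x(t) + Whh h(t-1) + bh),  y(t) = softmax(Wyh h(t) + by)."""
    h = np.zeros(Whh.shape[0])               # h(0) initialized to zeros
    ys = []
    for x in xs:
        h = 1.0 / (1.0 + np.exp(-(WhX @ x + Whh @ h + bh)))
        ys.append(softmax(Wyh @ h + by))
    return np.array(ys)

rng = np.random.default_rng(1)
n_in, n_hid, n_out, T = 2, 4, 3, 5           # e.g. (x, y) pen samples in, 3 classes out
xs = rng.normal(size=(T, n_in))
ys = rnn_forward(xs,
                 rng.normal(size=(n_hid, n_in)), rng.normal(size=(n_hid, n_hid)),
                 rng.normal(size=(n_out, n_hid)),
                 np.zeros(n_hid), np.zeros(n_out))
print(ys.shape)   # (5, 3); each row is a probability distribution over classes
```

LSTM replaces the single sigmoid update of h with a gated memory cell, but the surrounding recurrence has the same shape.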
In this work, using RNNLIB [103] with the Python language, RNNs with the LSTM architecture, without any feature extraction (using only the preprocessed sensory input) and both with and without the proposed pre-classification, are applied to the handwritten data. With the proposed pre-classification, each subset is presented to a recurrent neural network specifically trained for that subset. Results of the RNN classifier without the proposed pre-classifier have also been obtained to check the end-to-end capability of the RNN classifier. Using the raw stroke data saved in InkML files, each RNN is trained, validated, and tested on 30%, 20%, and 50% of randomly selected subsets of the data-set, respectively. To recognize all 108 online handwritten Urdu characters together, that is, without going through pre-classification, the recurrent neural network took more than 100 hours to produce its most accurate results. Table 6.3 shows the configurations of the RNNs used for each subset.
6.6 Deep Belief Network
Deep belief networks (DBNs) were introduced by Hinton in 2006 [104] to explore the dependencies between hidden and visible units [105]. To set up a DBN, a bank of restricted Boltzmann machines (RBMs) [106] is stacked one on top of another, forming a special type of Bayesian probabilistic generative model. The layers of the RBMs are connected in such a way that the visible layer of each RBM is anchored to the hidden layer of the previous RBM, and the connection between an upper layer and a lower layer is set in a top-down manner [107]. Each nonlinear layer of the DBN gradually learns more complex structures in the data, which makes DBNs a promising approach to pattern classification problems [108]. Problems successfully addressed by variants of deep generative models include visual object recognition, speech recognition, natural language processing, information retrieval, and regression analysis [109].
In this research work, deep belief networks are also implemented for the recognition of online handwritten Urdu characters. The DBN classifiers are implemented both with and without pre-classification. Using sensory input as well as wavelet features, each DBN is trained, validated, and tested on 30%, 20%, and 50% of randomly selected subsets of the data-set, respectively. For all character subsets, Table 6.4 shows the number of RBMs stacked upon each other and the number of neurons in each RBM. Discrete Meyer wavelets have been used as the input features for this configuration.
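The greedy layer-wise training of such stacked RBMs rests on the contrastive divergence (CD-1) update. The NumPy sketch below shows one CD-1 step for a small binary RBM; the sizes and the toy data are placeholders, not the configurations of Table 6.4:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(v0, W, b_vis, b_hid, lr=0.1):
    """One CD-1 step for a binary RBM; v0 has shape (n_samples, n_vis)."""
    # Positive phase: hidden probabilities and a sample driven by the data.
    ph0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to a reconstruction.
    pv1 = sigmoid(h0 @ W.T + b_vis)
    ph1 = sigmoid(pv1 @ W + b_hid)
    # Gradient estimate: data statistics minus reconstruction statistics.
    n = len(v0)
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    b_vis += lr * (v0 - pv1).mean(axis=0)
    b_hid += lr * (ph0 - ph1).mean(axis=0)
    return float(((v0 - pv1) ** 2).mean())   # reconstruction error

n_vis, n_hid = 6, 4
W = rng.normal(0, 0.1, (n_vis, n_hid))
b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)
# Toy data: each sample is either all zeros or all ones across visible units.
data = np.repeat((rng.random((50, 1)) < 0.5).astype(float), n_vis, axis=1)
errors = [cd1_update(data, W, b_vis, b_hid) for _ in range(100)]
print(round(errors[0], 3), round(errors[-1], 3))
```

Stacking then means freezing this RBM, feeding its hidden probabilities to the next RBM as visible data, and repeating; the final discriminative layer is trained against class labels.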
Table 6.3: RNN configurations (trained using sensory input values).

Target Group | No. of layers | Hidden Block | Hidden Size | Hidden Type
RNN Configuration: Initial Half Forms
Single-stroke 8 1 100 LSTM
2-stroke dot Above 11 1 100, 200 LSTM
2-stroke other- Above 8 2 739 LSTM
2-stroke dot Below 11 1 32, 19 LSTM
2-stroke other- Below 8 2; 4 59 LSTM
3-stroke dot Above 8 2 100 LSTM
3-stroke other- Above 8 1 2 LSTM
4-stroke dot Above 8 5; 2 385 LSTM
4-stroke dot Below 13 2; 2 121, 7 LSTM
RNN Configuration: Medial Half Forms
Single-stroke 8 1 100 LSTM
2-stroke dot Above 8 1 100 LSTM
2-stroke other- Above 8 1 23 LSTM
2-stroke dot Below 8 3 10 LSTM
3-stroke dot Above 8 1 7 LSTM
3-stroke other- Above 8 3; 6 8, 8 LSTM
4-stroke dot Above 8 6; 5 262 LSTM
4-stroke dot Below 8 3 6 LSTM
RNN Configuration: Terminal Half Forms
Single-stroke 8 2 174 LSTM
2-stroke dot Above 8 1 71 LSTM
2-stroke other- Above 8 1 100 LSTM
3-stroke dot Above 8 1 3 LSTM
4-stroke dot Above 8 2 100 LSTM
6.7 AutoEncoders
AutoEncoders (AEs) [110,111] have a key role in deep structured architectures and
unsupervised learning. An AutoEncoder aims to learn salient features for a set of
Table 6.4: DBN configurations (trained using wavelet dmey approximation and
detailed coefficients). For each target set, the table lists visible/hidden
neuron counts for Generative RBM-I and RBM-II (performance method:
reconstruction) and for the Discriminative RBM (performance method:
classification).

Target Set              RBM-I Vis/Hid   RBM-II Vis/Hid   Disc. RBM Vis/Hid

Initial Half-Forms
Single-stroke             209 / 500       500 / 500        508 / 2000
2-stroke dot Above        209 / 500       500 / 500        507 / 2000
2-stroke other- Above     209 / 500       500 / 500        507 / 2000
2-stroke dot Below        209 / 500       500 / 500        504 / 2000
2-stroke other- Below     209 / 500       500 / 500        503 / 1200
3-stroke dot Above        209 / 500       500 / 500        504 / 2000
3-stroke other- Above     209 / 500       500 / 500        503 / 2000
4-stroke dot Above        209 / 500       500 / 300        304 / 2000
4-stroke dot Below        209 / 500       500 / 500        504 / 193

Medial Half-Forms
Single-stroke             209 / 500       500 / 500        509 / 2000
2-stroke dot Above        209 / 870       870 / 700        707 / 500
2-stroke other- Above     209 / 700       700 / 1000      1005 / 2000
2-stroke dot Below        209 / 500       500 / 500        503 / 1000
3-stroke dot Above        209 / 850       850 / 950        953 / 2020
3-stroke other- Above     209 / 950       950 / 850        853 / 2020
4-stroke dot Above        209 / 500       500 / 500        503 / 1000
4-stroke dot Below        209 / 50         50 / 50          53 / 500

Terminal Half-Forms
Single-stroke             209 / 500       500 / 750        767 / 200
2-stroke dot Above        209 / 413       413 / 600        610 / 717
2-stroke other- Above     209 / 700       700 / 830        838 / 700
3-stroke dot Above        209 / 400       400 / 500        504 / 800
4-stroke dot Above        209 / 300       300 / 453        458 / 700
input data, usually to reduce the dimensionality of the data. In recent years,
AutoEncoders have also been applied to learn generative models of data. Besides
extracting continuous features, an AutoEncoder filters out noisy elements of the
input. The hidden layer of an AutoEncoder can itself be encoded by another
hidden layer, which results in stacked AutoEncoders. Many variants of the
AutoEncoder model have been proposed to capture additional structural properties
of the input data, for example Denoising AutoEncoders and Sparse AutoEncoders.
Moreover, AutoEncoders
can be combined with other machine learning algorithms, such as artificial
neural networks or SVMs, for classification purposes [112].

AutoEncoders are implemented here for dimensionality reduction of the discrete
Meyer wavelet features and of the sensory input values. The features extracted
by the AutoEncoder are then fed to SVMs to obtain comparable recognition
results. All the DBNs and AutoEncoders are implemented using the Deep Belief
Network (DeeBNet) toolbox [113].
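As an illustration of the dimensionality-reduction role of the AutoEncoder, the sketch below trains a tiny linear AutoEncoder with tied weights by stochastic gradient descent and returns an encoder mapping inputs to a lower-dimensional code. This is a pure-Python toy under stated assumptions, not the DeeBNet implementation; `train_linear_autoencoder` is a hypothetical name, and the SVM stage that would consume the codes is omitted.

```python
import random

def train_linear_autoencoder(X, n_hidden, lr=0.01, epochs=100, seed=0):
    """Linear AutoEncoder with tied weights W: encode h = W x,
    decode x' = W^T h, trained by SGD on squared reconstruction error.
    Returns an encoder mapping x to its n_hidden-dimensional code."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    for _ in range(epochs):
        for x in X:
            h = [sum(W[j][i] * x[i] for i in range(n_in))
                 for j in range(n_hidden)]
            xr = [sum(W[j][i] * h[j] for j in range(n_hidden))
                  for i in range(n_in)]
            err = [xr[i] - x[i] for i in range(n_in)]
            for j in range(n_hidden):
                # gradient of 0.5*||x' - x||^2 w.r.t. W[j][i]; the second
                # term flows through h because the weights are tied
                back = sum(err[k] * W[j][k] for k in range(n_in))
                for i in range(n_in):
                    W[j][i] -= lr * (err[i] * h[j] + x[i] * back)

    def encode(x):
        return [sum(W[j][i] * x[i] for i in range(n_in))
                for j in range(n_hidden)]
    return encode
```

In the experiments reported here the codes (e.g. 100 features from 224 sensory input values, per Table 6.7) would then be passed to an SVM for classification.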
6.8 Results and Discussions

The recognition results of online Urdu handwritten characters, obtained using
the classification techniques described previously, are given here. The results
fall into two main categories:
• Recognition results obtained after applying the proposed pre-classification,
and
• Recognition results obtained without applying proposed pre-classification.
6.9 Results with Pre-Classification
The pre-classifier produced a total of 28 subsets and sub-subsets from the set
of 108 half-form characters (see Table 4.1 in Chapter 4). Of these 28 subsets
and sub-subsets, 6 contain only one character and need no further
classification (Chapter 4, Table 4.2). The remaining 22 subsets are classified
using the different classifiers and feature types described earlier. A
comparable range of recognition results can be seen in Tables 6.5 and 6.6.
6.10 Results without Pre-Classification
The Urdu characters under consideration in this study have also been recognized
without the proposed pre-classification. Support vector machines, deep belief
networks, AutoEncoder-based classifiers, and recurrent neural networks are used
for this recognition task. The results are reported in Table 6.7.
Table 6.5: Recognition rates for each subset of handwritten half-form Urdu
characters obtained from the pre-classifier. Results obtained with ANNs and
SVMs using different features are presented for comparison.

                             Recognition Rate (%)
Character-Subset               ANN                       SVM
                        Structural   db2     db2   bior1.3   dmey   Sensory Input

Initial Half-Forms
Single-stroke              80.7      93.3    94.7    95.9    93.7      95.4
2-stroke dot Above         81.3      90.2    99.1    98.0    96.0      99.0
2-stroke other- Above      76.3      87.7    91.9    98.0    93.3      89.3
2-stroke dot Below         92.2      94.4    97.2    97.7    96.0      98.6
2-stroke other- Below      90.0      97.5    98.3    96.6    96.0      96.0
3-stroke dot Above         88.8      97.7    94.4    95.5    97.3      94.6
3-stroke other- Above      99.1      98.3    100     100     100       100
3-stroke dot Below ‘Í’     100
4-stroke dot Above         77.7      89.4    88.8    83.7    89.3      86.6
4-stroke dot Below         88.8      89.0    92.7    93.3    93.3      94.0

Medial Half-Forms
Single-stroke              62.7      83.7    89.1    90.0    92.0      93.7
2-stroke dot Above         58.0      81.6    93.6    91.3    92.0      90.0
2-stroke other- Above      80.4      91.6    93.3    94.1    94.0      93.0
2-stroke dot Below         99.0      99.1    98.3    100     99.0      100
2-stroke other- Below ‘È’  100
3-stroke dot Above         95.0      98.3    95.0    98.3    95.0      97.0
3-stroke other- Above      94.1      97.5    95.8    97.5    97.0      98.0
3-stroke dot Below ‘Î’     100
4-stroke dot Above         87.5      97.5    95.8    95.8    97.0      96.0
4-stroke dot Below         97.5      100     100     100     100       100

Terminal Half-Forms
Single-stroke              78.4      81.7    94.7    94.2    96.7      96.3
2-stroke dot Above         66.6      93.3    96.7    97.2    90.4      91.5
2-stroke other- Above      82.6      95.7    99.0    99.2    99.1      99.4
2-stroke dot Below ‘K’     100
3-stroke dot Above         93.3      95.5    99.4    100     99.3      100
3-stroke other- Above ‘{’  100
4-stroke dot Above         94.1      97.9    99.6    99.1    100       98.5
4-stroke dot Below ‘„’     100

Overall Accuracy (%)       80.9      91.0    95.5    95.3    95.4      95.4
Table 6.6: Recognition rates for each subset of handwritten Urdu characters
obtained from the pre-classifier. Results obtained with DBN, AE-DBN, AE-SVM
and RNN using different features are presented for comparison.

                             Recognition Rate (%)
Character-Subset              DBN              AE-DBN          AE-SVM    RNN
                        dmey   Sensory    dmey   Sensory     Sensory  Sensory
                               Input             Input       Input    Input

Initial Half-Forms
Single-stroke           95.0    94.8      92.5    93.7        96.0     75.4
2-stroke dot Above      94.0    94.6      96.3    95.3        99.3     84.6
2-stroke other- Above   89.6    87.6      90.6    91.3        86.0     73.3
2-stroke dot Below      97.3    96.6      98.0    96.6        98.6     94.0
2-stroke other- Below   96.0    99.0      98.0    94.0        97.0     90.0
3-stroke dot Above      96.6    96.6      95.3    91.3        98.6     91.3
3-stroke other- Above   98.0    100       97.0    100         100      97.0
3-stroke dot Below ‘Í’  100
4-stroke dot Above      84.0    85.3      84.6    84.0        90.6     85.3
4-stroke dot Below      93.3    94.6      93.3    93.3        92.6     88.6

Medial Half-Forms
Single-stroke           86.2    90.2      86.7    92.0        93.0     71.5
2-stroke dot Above      91.6    83.6      91.3    93.6        86.3     74.0
2-stroke other- Above   94.0    93.5      96.0    91.5        93.0     73.5
2-stroke dot Below      99.0    99.0      100     99.0        100      96.0
2-stroke other- Below ‘È’ 100
3-stroke dot Above      95.0    93.0      94.0    97.0        97.0     90.0
3-stroke other- Above   97.0    97.0      98.0    97.0        97.0     94.0
3-stroke dot Below ‘Î’  100
4-stroke dot Above      96.0    97.0      95.0    93.0        94.0     86.0
4-stroke dot Below      100     100       100     100         100      99.0

Terminal Half-Forms
Single-stroke           91.3    94.7      89.8    90.2        96.6     84.6
2-stroke dot Above      85.5    87.7      87.1    89.1        92.8     92.0
2-stroke other- Above   97.1    97.4      97.4    98.0        98.5     91.7
2-stroke dot Below ‘K’  100
3-stroke dot Above      99.3    100       99.3    98.6        100      99.3
3-stroke other- Above ‘{’ 100
4-stroke dot Above      99.0    98.5      99.0    98.5        99.0     97.5
4-stroke dot Below ‘„’  100

Overall Accuracy (%)    93.2    93.7      93.2    93.7        95.3     85.9
Table 6.7: Recognition rates for half-form Urdu characters without going through
pre-classification. Results are obtained using SVMs, DBN, AE-DBN, AE-SVM and
RNN using different features.

Classifier   Recognition Rate (%)   Features Type              Number of Features
SVM                 55.0            Sensory input values               224
SVM                 42.2            Discrete Meyer wavelets            239
DBN                 51.3            Sensory input values               224
DBN                 46.3            Discrete Meyer wavelets            239
AE-SVM              51.2            Sensory input values               100
AE-DBN              45.0            Sensory input values                99
AE-DBN              35.5            Discrete Meyer wavelets             99
RNN                 67.3            Sensory input values        Variable stroke length
6.11 Maximum Recognition Rate
The accuracy of online Urdu handwritten character recognition can be viewed
from the following perspectives:

Overall Accuracy: Overall accuracy (subsection 6.11.1) describes the recogni-
tion rates delivered by different classifiers taking everything into account.
With pre-classification in practice, the overall accuracy for different
classifier-feature combinations ranges from 85.9% to 95.5%, whereas without
pre-classification, different classifier-feature combinations produce
accuracies between 35.5% and 67.3%.
Subset-wise Accuracy: Subset-wise accuracy (subsection 6.11.2) describes how
successful each classifier from the bank of classifiers proved to be for the
individual subsets. Accuracies for all the subsets with different
classifier-feature combinations are presented in Tables 6.5 and 6.6; for each
subset, the maximum accuracy among the different classifier-feature
combinations is taken as its subset-wise accuracy.
Character-wise Accuracy: It is also important to know how accurately each
character is recognized by the classifiers with different features (subsection
6.11.3). In this study, 29% of the characters are recognized with 100%
accuracy, while 58% of the characters achieve more than 90% recognition
accuracy. The rest of the characters attain more than 80% recognition rate.
6.11.1 Overall Accuracy
with Pre-Classification: Among the various feature-classifier combinations, the
maximum overall recognition accuracy of 95.5% is achieved by the db2-SVM
combination over all the subsets obtained through pre-classification. This
maximum accuracy is computed with the inclusion of the subsets whose
characters are recognized at the pre-classification stage (these subsets are
given in Table 4.2). Tables 6.5 and 6.6 show that support vector machines
performed best among all the classifiers mentioned previously. In fact, all
the combinations involving SVMs, whether feature-classifier combinations
(db2-SVM, biorthogonal-SVM, dmey-SVM, sensory-input-SVM) or the
classifier-classifier combination (AutoEncoder-SVM), delivered accuracies in
the highest range, that is, more than 95%. ANN with db2 wavelet features
provided a somewhat lower overall accuracy of 91.0% compared to SVM. For ANNs
with structural features, the overall accuracy of 80.9% is significantly lower
than for all other combinations. Moreover, RNNs with multiple LSTM hidden
layers of varying sizes delivered an overall recognition accuracy of 85.9%.
Deep belief networks, whether used with discrete Meyer wavelets or with
sensory input, provided about 93% accurate results. DBN and SVM resulted in
overall accuracies of 93.7% and 95.3% respectively when using features
extracted by AutoEncoders from the sensory input.
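The overall accuracy across subsets can be sketched as a sample-weighted average, under the assumption (stated above) that subsets fully resolved at the pre-classification stage count as 100%; `overall_accuracy` is a hypothetical helper, and the weighting by subset sample counts is an assumption about how the reported figures aggregate.

```python
def overall_accuracy(subset_acc, subset_sizes):
    """Sample-weighted overall accuracy across subsets.
    subset_acc: {subset: accuracy%}; subsets absent from subset_acc are
    assumed resolved at pre-classification and counted as 100%.
    subset_sizes: {subset: number of test samples}."""
    total = sum(subset_sizes.values())
    weighted = sum(subset_acc.get(s, 100.0) * n
                   for s, n in subset_sizes.items())
    return weighted / total
```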
without Pre-Classification: In this case, the maximum overall accuracy of 67.3%
is delivered by the recurrent neural network using sensory input values. For
the other classifiers (SVM, DBN, AE-DBN), the recognition accuracy ranges from
35.5% to 55.0%, as shown in Table 6.7. The large difference between the
accuracies achieved with and without pre-classification demonstrates the
effectiveness of the proposed pre-classification.
6.11.2 Polling or Subset-wise Accuracy
In Tables 6.5 and 6.6, the results obtained through different classifier-feature
combinations are presented subset-wise. Through polling, i.e. taking for each
subset the maximum recognition rate among all the classifiers, it can be
observed that 11 of the 28 subsets are recognized with 100% accuracy. For 13
subsets the recognition accuracy is ≥ 96%, while the remaining 4 subsets are
recognized with more than 90% accuracy. With subset-wise maximum recognition
rates, the overall accuracy of the system becomes 97.2%. This accuracy is
greater than that yielded by the SVM+db2-wavelet-features combination
(95.5%), which suggests using a bank of classifiers for Urdu character
recognition rather than relying on a single classifier.
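The polling scheme above can be expressed compactly. This sketch assumes an unweighted mean over subsets (the reported 97.2% may instead be weighted by subset sample counts); `polling_accuracy` is a hypothetical name.

```python
def polling_accuracy(subset_results):
    """subset_results: {subset: {classifier: accuracy%}}.
    Polling takes, for each subset, the best accuracy over the classifier
    bank, then averages the per-subset maxima across subsets."""
    best = {s: max(rates.values()) for s, rates in subset_results.items()}
    return sum(best.values()) / len(best)
```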
6.11.3 Character-wise Accuracy
Recognition accuracy for each character is given in the character-wise accuracy
chart, Table 6.8. Of the 108 target shapes, 32 characters are recognized with
100% accuracy, and 44 characters achieve more than 95% accuracy. The accuracy
of 19 characters lies between 90% and 95%, while the accuracy of 12 characters
lies between 80% and 90%. The terminal half-form ‘f’ has the minimum accuracy
of 73.3%.
6.12 Error Analysis using Confusion Matrices
Some confusion matrices are presented in this section for the best and worst
cases of the best classifier-feature combination, i.e.
SVM+db2-wavelet-features. In each confusion matrix, X marks an unknown class.
The confusion matrices for all subsets are given in Appendix A.

Table 6.9 shows the confusion matrix of a subset that contains 3 characters.
The overall accuracy of this subset is 88.8%, which is among the lowest
accuracies obtained with the SVM+db2-wavelet-features combination.
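The per-subset accuracies quoted with each confusion matrix are simply the trace of the matrix divided by the total number of samples. The sketch below (with a hypothetical helper name) reproduces the 88.8% reported for the 4-stroke subset of Table 6.9.

```python
def subset_accuracy(confusion):
    """Accuracy (%) from a square confusion matrix whose rows are true
    classes and columns are predicted classes: trace / total."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total

# Counts from Table 6.9 (classes T, Q, q; the X column is all zeros).
m_table_6_9 = [[51, 7, 2],
               [2, 55, 3],
               [3, 3, 54]]
```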
Table 6.8: Character accuracy chart

Character:   L I JK
Accuracy %:  95 98.3 93.3 96.6 100

Character:   … ‚ ƒ „ P M N O
Accuracy %:  83.3 95.0 100 100 90.0 95.0 98.3 100

Character:   “ — ‘ ’ T Q R S
Accuracy %:  81.6 90.0 85.0 98.3 85.0 91.6 91.6 100

Character:   U V W † ‡ ˆ
Accuracy %:  100 100 98.3 100 100 100

Character:   Y Z \ a b c
Accuracy %:  96.6 95.0 96.6 100 100 93.3

Character:   f • h j ™ l ‹
Accuracy %:  73.3 98.3 96.6 96.6 100 100 100

Character:   m n o q r s
Accuracy %:  93.3 93.3 98.3 90.0 100 98.3

Character:   u v w y z |
Accuracy %:  93.3 85.0 90.0 98.3 98.3 98.3

Character:   ø ¡ ñ £ ¤ ¥
Accuracy %:  98.3 86.6 96.6 100 98.3 96.6

Character:   § ¨ © « ¬ −
Accuracy %:  95.0 80.0 96.6 100 85.0 90.0

Character:   ¯ ° ± ³ ´ µ
Accuracy %:  98.3 85.0 96.6 98.3 91.6 98.3

Character:   · ù ¹ º ¸
Accuracy %:  100 98.3 100 95.0 100

Character:   Œ õ ò ô {
Accuracy %:  100 100 98.3 93.3 100
Table 6.8 continued...

Character:   » ¼ ½ ¿ À Á
Accuracy %:  100 96.6 98.3 86.6 83.3 98.3

Character:   Ã Ä Å Ì Û è é
Accuracy %:  98.3 95.0 100 90.0 98.3 93.3 98.3

Character:   ì ä È É ú Ñ
Accuracy %:  98.3 98.3 100 96.6 93.3 100

Character:   Ö å × š Ł
Accuracy %:  95.0 86.6 93.3 100 100

Character:   Í Î Ó ˛
Accuracy %:  100 100 93.3 98.3
Table 6.9: Confusion matrix for 4-stroke characters (initial half-form) with
dot diacritic above the major stroke

         T    Q    q    X   Total
T       51    7    2    0     60
Q        2   55    3    0     60
q        3    3   54    0     60
Total   56   65   59    0
Table 6.10 shows the confusion matrix of a subset containing 6 characters. The
recognition accuracy for this subset is 91.9%. The character å is misclassified
3 times as Ö and 4 times as “. This is to be expected because of the shape
similarity among these characters. Similarly, “ is misclassified 7 times as —
and 4 times as å for the same reason.
Table 6.11 shows the confusion matrix for another subset yielding a low overall
recognition accuracy (93.6%) with the SVM+db2-wavelet-features combination.
The main culprits for the low accuracy in this subset are the characters ° and
¬. Although ° and ¬ have distinct major strokes in standard form, with ¬
Table 6.10: Confusion matrix for initial half-form 2-stroke characters with
other-than-dot diacritic above the major stroke. Overall accuracy for this
subset is 91.9%

         Ö    å    ·    ù    “    —    X   Total
Ö       57    2    0    0    1    0    0     60
å        3   52    1    0    4    0    0     60
·        0    0   60    0    0    0    0     60
ù        0    0    1   59    0    0    0     60
“        0    4    0    0   49    7    0     60
—        2    2    0    0    2   54    0     60
Total   62   60   62   59   56   61    0
Table 6.11: Confusion matrix for medial half-form 2-stroke characters with dot
diacritic above the major stroke. Overall accuracy for this subset is 93.6%

         °    ¬    b    Ä    ¤    z    X   Total
°       51    5    0    1    0    3    0     60
¬        7   51    0    1    0    1    0     60
b        0    0   60    0    0    0    0     60
Ä        1    1    0   57    1    0    0     60
¤        0    0    1    0   59    0    0     60
z        0    0    0    0    1   59    0     60
Total   59   57   61   59   61   63    0
having a cusp in its major stroke, many writers omit this cusp when writing ¬
casually. The ¬ then appears very similar to °. This is confirmed by the
confusion matrix, which shows that ¬ is misclassified 7 times as °. Removing
both ¬ and ° from this subset results in 96.6% accuracy (confusion matrix in
Table 6.12). Removing only ° results in 96.3% accuracy, while removing only ¬
gives 96.6% accuracy (confusion matrices in Tables 6.13 and 6.14).
The confusion matrix of another subset yielding a low overall accuracy of 93.3%
is presented in Table 6.15. Here × and ‘ are responsible for the low
recognition rate. Both characters have the same major stroke but distinct
minor strokes, so the minor stroke was also utilized for feature vector
formation. However, casual penning of the minor
Table 6.12: Confusion matrix for medial half-form 2-stroke characters with dot
diacritic above the major stroke, excluding both ° and ¬. Overall accuracy for
this subset is 96.6%

         b    Ä    ¤    z    X   Total
b       60    0    0    0    0     60
Ä        1   56    1    2    0     60
¤        1    0   59    0    0     60
z        0    0    3   57    0     60
Total   62   56   63   59    0
strokes results in similar shapes of the minor strokes. Consequently, ‘ is
misclassified 9 times as ×. Tables 6.16 and 6.17 present two subsets showing
high overall recognition accuracy.
6.12.1 Confusing Characters
In Urdu there are a few groups of characters in which the major stroke is
common to the group and discrimination is made on the basis of the minor
strokes. This similarity is inherent to Urdu, and such similar characters were
placed into different subsets by the pre-classifier. Another kind of
similarity between characters arises from careless writing by the user. This
user-imposed similarity occurs inside the subsets produced by the
pre-classifier and results in confusing pairs of characters within a subset.
Table 6.13: Confusion matrix for medial half-form 2-stroke characters with dot
diacritic above the major stroke, excluding °. Overall accuracy for this subset
is 96.3%

         ¬    b    Ä    ¤    z    X   Total
¬       58    0    1    0    1    0     60
b        1   59    0    0    0    0     60
Ä        0    1   56    1    2    0     60
¤        0    1    0   59    0    0     60
z        0    0    0    3   57    0     60
Total   59   61   57   63   60    0
Table 6.14: Confusion matrix for medial half-form 2-stroke characters with dot
diacritic above the major stroke, excluding ¬. Overall accuracy for this subset
is 96.6%

         °    b    Ä    ¤    z    X   Total
°       54    0    2    0    4    0     60
b        0   60    0    0    0    0     60
Ä        1    0   58    1    0    0     60
¤        0    1    0   59    0    0     60
z        0    0    0    1   59    0     60
Total   55   61   60   61   63    0
Table 6.15: Confusion matrix for medial half-form 2-stroke characters with
other-than-dot diacritic above the major stroke. Overall accuracy for this
subset is 93.3%

         ×    ¹    º    ‘    X   Total
×       56    1    0    3    0     60
¹        0   60    0    0    0     60
º        0    3   57    0    0     60
‘        9    0    0   51    0     60
Total   65   64   57   54    0
Table 6.16: Confusion matrix for terminal half-form 2-stroke characters with
dot diacritic above the major stroke. Overall accuracy for this subset is 96.7%

         ±    −    W    c    Å    h    l    ¥    |    X   Total
±       58    0    0    0    0    0    0    0    2    0     60
−        0   54    0    6    0    0    0    0    0    0     60
W        0    1   59    0    0    0    0    0    0    0     60
c        0    4    0   56    0    0    0    0    0    0     60
Å        0    0    0    0   60    0    0    0    0    0     60
h        0    0    0    0    0   58    2    0    0    0     60
l        0    0    0    0    0    0   60    0    0    0     60
¥        0    0    0    1    0    1    0   58    0    0     60
|        1    0    0    0    0    0    0    0   59    0     60
Total   59   59   59   63   60   59   62   58   61    0
Figure 6.4 shows a few handwritten samples of two confusing characters, °
(Fay) and ¬ (Ghain), present in the subset whose confusion matrix is shown in
Table 6.11. If drawn according to the rules, the character ¬ should have a
well-defined cusp
Table 6.17: Confusion matrix for terminal half-form 4-stroke characters with
dot diacritic above the major stroke. Overall accuracy for this subset is 99.6%

         ˆ    S    s    ‹    X   Total
ˆ       60    0    0    0    0     60
S        0   60    0    0    0     60
s        0    0   59    1    0     60
‹        0    0    0   60    0     60
Total   60   60   59   61    0
Figure 6.4: Handwritten samples of ° (Fay) and ¬ (Ghain). The ¬ (Ghains) are
confusingly similar to the ° (Fays).
in its major stroke. Some users do not draw the cusp when writing casually or
in a hurry. A ¬ drawn in this way appears like ° to a human reader as well, as
can be seen in Figure 6.4. The classifier likewise misclassified ¬ as ° many
times, and vice versa, as shown in the confusion matrix of Table 6.11.
Another pair of confusing characters is shown in Figure 6.5: the characters ×
(hamza) and ‘ (Tay) in medial form. The major stroke of both characters is the
same, and discrimination is made on the basis of the minor stroke. Many users
casually draw the minor stroke of ‘ in a manner very similar to the minor
stroke of ×. The confusion matrix for this subset in Table 6.15 confirms this:
‘ has been misclassified 9 times as ×. In medial half-form, the pair ¨ and v
causes confusion when handwritten (Figure 6.6) due to the absence of the tooth
required to be drawn with v. Handwritten samples shown in Figures 6.7 and 6.8
also depict the confusing resemblance between ¿ and u in initial half-forms
and between f and Ì in terminal half-forms.
Figure 6.5: Handwritten samples of × (hamza) and ‘ (Tay). The ‘ (Tays) are
confusingly similar to the × (hamzas).
Figure 6.6: Handwritten samples of ¨ (ain) and v (swad). The ¨ (ains) are
confusingly similar to the v (swads).
Figure 6.7: Handwritten samples of ¿ (meem) and u (swad). The ¿ (meems) are
confusingly similar to the u (swads).
Figure 6.8: Handwritten samples of f (daal) and Ì (wao). The f (daals) are
confusingly similar to the Ì (waos).
Chapter 7
Conclusion
In this study, a novel recognition system for online handwritten Urdu
characters is presented. All initial, medial, and terminal half-form characters
have been recognized. A large-scale handwriting data set was acquired from
100 native Urdu writers of different age groups and educational qualifications.
The data was acquired using a digitizing tablet. Spatial coordinates in
temporal order, their respective pressure values, and pen up/down events were
recorded. The raw data was refined through different preprocessing operations.
A novel pre-classifier was designed to pre-classify the Urdu character set
into smaller subsets based on the number of strokes (single-, two-, three-,
and four-stroke subsets). The pre-classifier further divided the subsets based
on the position of the minor stroke with respect to the major stroke, and on
whether the minor stroke is a dot or other-than-dot. The pre-classifier helped
discriminate similar characters from each other by putting them in different
subsets. Three types of features, namely structural features, wavelet
transform features, and sensory input values, were extracted. Wavelet features
were obtained using the Daubechies db2, Biorthogonal bior1.3, and discrete
Meyer families. ANN, SVM, DBN, AE-DBN, AE-SVM and RNN classifiers were used
for fine classification of the individual characters in the subsets generated
by the pre-classifier. The classifiers were also employed without the proposed
pre-classification. Results of the RNN (LSTM) classifier without the proposed
pre-classifier and features were also obtained to check the end-to-end
capability of the RNN classifier. Since there is insufficient previous
work for comparison, different combinations of features and classifiers were
tried to find the best recognition results. Thirteen (13) different
feature-classifier combinations were tried, which resulted in overall
accuracies of 80.9%, 91.0%, 95.5%, 95.3%, 95.4%, and 95.4% with classical
approaches and 93.2%, 93.7%, 93.2%, 93.7%, 95.3%, and 85.9% with deep learning
classifiers (DBN, AE-DBN, AE-SVM, RNN). The best overall recognition rate of
95.5% was found for the SVM+db2-wavelet-features combination. For individual
characters, recognition rates were between 80% and 100%, with the exception of
one character that obtained an accuracy of 73.3%. Overall accuracy for
different subsets was between 88.8% and 100% for the
SVM+db2-wavelet-features combination, and overall accuracies for all
initials, medials, and terminals were 95.1%, 93.9%, and 97.0% respectively. We
followed the segmentation-based approach, which requires extraction of
half-forms of characters from the ligatures; here the data was obtained in
segmented form from the users. Research on segmentation of ligatures into
half-form characters is being carried out in parallel to this work. The RNNs'
promise of end-to-end recognition capability was also explored but was found
to yield inferior results compared to the classical feature-based approaches
of SVM and ANN; DBNs, however, produced comparable results. The results with
RNNs may improve if more data is added to the database.
7.1 Future Work
For future work, the following are of major concern:

• Focus on segmentation of ligatures into half-forms and recognition of Urdu
handwritten words.

• Increasing the size of the database of handwritten Urdu characters.

• Implementation on touch screens and Android-based smartphones.

• Classification using deep convolutional neural networks.

• Implementation of the recognition system on a digital signal processor.

Another interesting direction is to capture and classify character data in
real time while the writer is writing on a page.
In the future, with an increased database size, other deep learning methods
such as convolutional neural networks may be employed. Other kinds of features
may also be explored.
Appendix A
Confusion Matrices
A.1 Confusion Matrices of Support Vector Classifier with db2-Wavelet-Features

Confusion matrices are presented in this section for all subsets of the best
classifier-feature combination, i.e. SVM+db2-wavelet-features. The overall
accuracy for this classifier-feature combination is 95.53%. In each confusion
matrix, X marks an unknown class.
Table A.1: Confusion matrix for single-stroke characters (initial half-form).
It contains 7 characters. Overall accuracy: 94.7%

         §    Y    »    ¿    m    u    ø    X   Total
§       57    0    0    0    0    0    3    0     60
Y        0   58    0    2    0    0    0    0     60
»        0    0   60    0    0    0    0    0     60
¿        0    3    0   52    0    4    1    0     60
m        0    0    1    3   56    0    0    0     60
u        0    1    0    3    0   56    0    0     60
ø        1    0    0    0    0    0   59    0     60
Total   58   62   61   60   56   60   63    0
Table A.2: Confusion matrix for 2-stroke characters (initial half-form) with
dot diacritic above the major stroke. It contains 6 characters. Overall
accuracy: 99.1%

         ¯    «    a    Ã    £    y    X   Total
¯       59    1    0    0    0    0    0     60
«        0   60    0    0    0    0    0     60
a        0    0   60    0    0    0    0     60
Ã        0    1    0   59    0    0    0     60
£        0    0    0    0   60    0    0     60
y        1    0    0    0    0   59    0     60
Total   60   62   60   59   60   59    0
Table A.3: Confusion matrix for 2-stroke characters (initial half-form) with
other-than-dot diacritic above the major stroke. It contains 6 characters.
Overall accuracy: 91.9%

         Ö    å    ·    ù    “    —    X   Total
Ö       57    2    0    0    1    0    0     60
å        3   52    1    0    4    0    0     60
·        0    0   60    0    0    0    0     60
ù        0    0    1   59    0    0    0     60
“        0    4    0    0   49    7    0     60
—        2    2    0    0    2   54    0     60
Total   62   60   62   59   56   61    0
Table A.4: Confusion matrix for 2-stroke characters (initial half-form) with
dot diacritic below the major stroke. It contains 3 characters. Overall
accuracy: 97.2%

         L    I    U    X   Total
L       59    1    0    0     60
I        4   56    0    0     60
U        0    0   60    0     60
Total   63   57   60    0
Table A.5: Confusion matrix for 2-stroke characters (initial half-form) with
other-than-dot diacritic below the major stroke. It contains 2 characters.
Overall accuracy: 98.3%

         ì    ä    X   Total
ì       59    1    0     60
ä        1   59    0     60
Total   60   60    0
Table A.6: Confusion matrix for 3-stroke characters (initial half-form) with
dot diacritic above the major stroke. It contains 3 characters. Overall
accuracy: 94.4%

         ³    P    M    X   Total
³       59    0    1    0     60
P        0   54    6    0     60
M        1    2   57    0     60
Total   60   56   64    0
Table A.7: Confusion matrix for 3-stroke characters (initial half-form) with
other-than-dot diacritic above the major stroke. It contains 2 characters.
Overall accuracy: 100%

         Œ    õ    X   Total
Œ       60    0    0     60
õ        0   60    0     60
Total   60   60    0
Table A.8: Confusion matrix for 4-stroke characters (initial half-form) with
dot diacritic above the major stroke. It contains 3 characters. Overall
accuracy: 88.8%

         T    Q    q    X   Total
T       51    7    2    0     60
Q        2   55    3    0     60
q        3    3   54    0     60
Total   56   65   59    0
Table A.9: Confusion matrix for 4-stroke characters (initial half-form) with
dot diacritic below the major stroke. It contains 3 characters. Overall
accuracy: 92.7%

         †    …    ‚    X   Total
†       60    0    0    0     60
…        0   50   10    0     60
‚        0    3   57    0     60
Total   60   53   67    0
Table A.10: Confusion matrix for single-stroke characters (medial half-form).
It contains 8 characters. Overall accuracy: 89.1%

         ¨    Z    ¼    À    n    v    ¡    è    X   Total
¨       48    0    3    0    0    6    1    2    0     60
Z        0   57    0    2    0    1    0    0    0     60
¼        0    0   58    0    1    1    0    0    0     60
À        2    3    1   50    0    4    0    0    0     60
n        0    0    0    2   56    2    0    0    0     60
v        6    0    1    0    2   51    0    0    0     60
¡        2    2    2    0    0    1   52    1    0     60
è        1    0    0    1    0    1    1   56    0     60
Total   59   62   65   55   59   67   54   59    0
Table A.11: Confusion matrix for 2-stroke characters (medial half-form) with
dot diacritic above the major stroke. It contains 6 characters. Overall
accuracy: 93.6%

         °    ¬    b    Ä    ¤    z    X   Total
°       51    5    0    1    0    3    0     60
¬        7   51    0    1    0    1    0     60
b        0    0   60    0    0    0    0     60
Ä        1    1    0   57    1    0    0     60
¤        0    0    1    0   59    0    0     60
z        0    0    0    0    1   59    0     60
Total   59   57   61   59   61   63    0
Table A.12: Confusion matrix for 2-stroke characters (medial half-form) with
other-than-dot diacritic above the major stroke. It contains 4 characters.
Overall accuracy: 93.3%

         ×    ¹    º    ‘    X   Total
×       56    1    0    3    0     60
¹        0   60    0    0    0     60
º        0    3   57    0    0     60
‘        9    0    0   51    0     60
Total   65   64   57   54    0
Table A.13: Confusion matrix for 2-stroke characters (medial half-form) with
dot diacritic below the major stroke. It contains 2 characters. Overall
accuracy: 98.3%

         J    V    X   Total
J       58    2    0     60
V        0   60    0     60
Total   58   62    0
Table A.14: Confusion matrix for 3-stroke characters (medial half-form) with
dot diacritic above the major stroke. It contains 2 characters. Overall
accuracy: 95.0%

         ´    N    X   Total
´       55    5    0     60
N        1   59    0     60
Total   56   64    0
Table A.15: Confusion matrix for 3-stroke characters (medial half-form) with
other-than-dot diacritic above the major stroke. It contains 2 characters.
Overall accuracy: 95.8%

         ò    ô    X   Total
ò       59    1    0     60
ô        4   56    0     60
Total   63   57    0
Table A.16: Confusion matrix for 4-stroke characters (medial half-form) with
dot diacritic above the major stroke. It contains 2 characters. Overall
accuracy: 95.8%

         R    r    X   Total
R       55    5    0     60
r        0   60    0     60
Total   55   65    0
Table A.17: Confusion matrix for 4-stroke characters (medial half-form) with
dot diacritic below the major stroke. It contains 2 characters. Overall
accuracy: 100%

         ‡    ƒ    X   Total
‡       60    0    0     60
ƒ        0   60    0     60
Total   60   60    0
Table A.18: Confusion matrix for single-stroke characters (terminal half-form).
It contains 16 characters. Overall accuracy: 94.7%

        ©       š    f    \    ½    Ó    Á    j    É    ú    o    w    ñ    é    Ì    X   Total
©      58   0    0    0    0    1    0    0    1    0    0    0    0    0    0    0    0     60
        0  57    0    0    0    0    0    0    0    0    1    0    0    0    2    0    0     60
š       0   0   60    0    0    0    0    0    0    0    0    0    0    0    0    0    0     60
f       0   0    0   44    0    1    0    0    3    0    0    1    0    1    0   10    0     60
\       2   0    0    0   58    0    0    0    0    0    0    0    0    0    0    0    0     60
½       0   0    0    0    0   59    0    0    0    0    0    0    0    0    1    0    0     60
Ó       0   0    0    0    0    1   56    0    0    0    0    3    0    0    0    0    0     60
Á       0   0    0    0    0    0    0   59    0    0    1    0    0    0    0    0    0     60
j       0   0    0    0    0    0    0    1   58    0    1    0    0    0    0    0    0     60
É       1   0    0    0    1    0    0    0    0   58    0    0    0    0    0    0    0     60
ú       0   0    0    0    0    0    0    1    3    0   56    0    0    0    0    0    0     60
o       0   0    0    0    0    0    0    0    0    0    0   59    1    0    0    0    0     60
w       0   0    0    0    0    2    0    0    0    0    0    2   54    0    1    1    0     60
ñ       0   0    0    1    0    0    0    0    0    0    0    1    0   58    0    0    0     60
é       0   0    0    0    0    0    0    0    0    0    0    0    0    1   59    0    0     60
Ì       0   0    0    4    0    0    0    0    1    0    0    0    0    0    1   54    0     60
Total  61  57   60   49   60   63   56   62   65   59   58   66   55   62   62   65    0
Table A.19: Confusion matrix for 2-stroke characters (terminal half-form) with
dot diacritic above the major stroke. It contains 9 characters. Overall
accuracy: 96.7%

         ±    −    W    c    Å    h    l    ¥    |    X   Total
±       58    0    0    0    0    0    0    0    2    0     60
−        0   54    0    6    0    0    0    0    0    0     60
W        0    1   59    0    0    0    0    0    0    0     60
c        0    4    0   56    0    0    0    0    0    0     60
Å        0    0    0    0   60    0    0    0    0    0     60
h        0    0    0    0    0   58    2    0    0    0     60
l        0    0    0    0    0    0   60    0    0    0     60
¥        0    0    0    1    0    1    0   58    0    0     60
|        1    0    0    0    0    0    0    0   59    0     60
Total   59   59   59   63   60   59   62   58   61    0
Table A.20: Confusion matrix for 2-stroke characters (terminal half-form) with
other-than-dot diacritic above the major stroke. It contains 7 characters.
Overall accuracy: 99.0%

         ™    •    ˛    Ł    ¸    ’    Û    X   Total
™       59    1    0    0    0    0    0    0     60
•        0   60    0    0    0    0    0    0     60
˛        0    0   59    0    1    0    0    0     60
Ł        0    0    0   60    0    0    0    0     60
¸        0    0    0    0   60    0    0    0     60
’        1    0    0    0    0   59    0    0     60
Û        0    0    0    0    1    0   59    0     60
Total   60   61   59   60   62   59   59    0
Table A.21: Confusion matrix for 3-stroke characters (terminal half-form) with
dot diacritic above the major stroke. It contains 3 characters. Overall
accuracy: 99.4%

         µ    Ñ    O    X   Total
µ       59    0    1    0     60
Ñ        0   60    0    0     60
O        0    0   60    0     60
Total   59   60   61    0
Table A.22: Confusion matrix for 4-stroke characters (terminal half-form) with
dot diacritic above the major stroke. It contains 4 characters. Overall
accuracy: 99.6%

         ˆ    S    s    ‹    X   Total
ˆ       60    0    0    0    0     60
S        0   60    0    0    0     60
s        0    0   59    1    0     60
‹        0    0    0   60    0     60
Total   60   60   59   61    0
A.2 Confusion Matrices of Support Vector Classifier with Sensory Input Values

This section presents the confusion matrices for all subsets of the second-best
classifier-feature combination, i.e., SVM + sensory input values. The overall
accuracy for this classifier-feature combination is 95.45%. In each confusion
matrix, X denotes an unknown class.
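As a cross-check, the overall accuracy quoted with each table is simply the sum of the diagonal (correctly classified samples) divided by the total sample count. A minimal sketch in plain Python, using the counts of Table A.21 with the all-zero X column omitted:

```python
# Confusion matrix: rows = true class, columns = predicted class.
# Counts taken from Table A.21 (3-stroke terminal half-forms, dot
# diacritic above); the empty X column is omitted.
cm = [
    [59, 0, 1],
    [0, 60, 0],
    [0, 0, 60],
]

total = sum(sum(row) for row in cm)                     # all test samples
correct = sum(cm[i][i] for i in range(len(cm)))         # diagonal entries
overall_accuracy = correct / total                      # 179 / 180
per_class_accuracy = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]

print(f"overall accuracy: {overall_accuracy:.1%}")      # prints 99.4%
```

The same computation reproduces the accuracy figure of every table in this appendix.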
Table A.23: Confusion matrix for single-stroke characters (initial half-form). It contains 7 characters. Overall accuracy: 95.4%

        §   Y   »   ¿   m   u   ø  X  Total
§      48   0   0   0   0   0   2  0   50
Y       0  49   0   1   0   0   0  0   50
»       0   0  50   0   0   0   0  0   50
¿       1   4   0  42   0   3   0  0   50
m       0   0   0   0  50   0   0  0   50
u       0   0   0   3   1  46   0  0   50
ø       0   1   0   0   0   0  49  0   50
Total  49  54  50  46  51  49  51  0
Table A.24: Confusion matrix for 2-stroke characters (initial half-form) with dot diacritic above the major stroke. It contains 6 characters. Overall accuracy: 99.0%

        ¯   «   a   Ã   £   y  X  Total
¯      48   1   0   0   0   1  0   50
«       0  50   0   0   0   0  0   50
a       0   0  50   0   0   0  0   50
Ã       0   0   0  50   0   0  0   50
£       0   0   1   0  49   0  0   50
y       0   0   0   0   0  50  0   50
Total  48  51  51  50  49  51  0
Table A.25: Confusion matrix for 2-stroke characters (initial half-form) with other-than-dot diacritic above the major stroke. It contains 6 characters. Overall accuracy: 89.3%

        Ö   å   ·   ù   “   —  X  Total
Ö      45   1   0   0   0   4  0   50
å       4  35   0   0  10   1  0   50
·       0   0  50   0   0   0  0   50
ù       0   0   1  49   0   0  0   50
“       0   4   0   0  42   4  0   50
—       2   0   0   0   1  47  0   50
Total  51  40  51  49  43  56  0
Table A.26: Confusion matrix for 2-stroke characters (initial half-form) with dot diacritic below the major stroke. It contains 3 characters. Overall accuracy: 98.6%

        L   I   U  X  Total
L      48   2   0  0   50
I       0  50   0  0   50
U       0   0  50  0   50
Total  48  52  50  0
Table A.27: Confusion matrix for 2-stroke characters (initial half-form) with other-than-dot diacritic below the major stroke. It contains 2 characters. Overall accuracy: 96.0%

        ì   ä  X  Total
ì      50   0  0   50
ä       4  46  0   50
Total  54  46  0
Table A.28: Confusion matrix for 3-stroke characters (initial half-form) with dot diacritic above the major stroke. It contains 3 characters. Overall accuracy: 94.6%

        ³   P   M  X  Total
³      50   0   0  0   50
P       0  43   7  0   50
M       0   1  49  0   50
Total  50  44  56  0
Table A.29: Confusion matrix for 3-stroke characters (initial half-form) with other-than-dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 100%

        Œ   õ  X  Total
Œ      50   0  0   50
õ       0  50  0   50
Total  50  50  0
Table A.30: Confusion matrix for 4-stroke characters (initial half-form) with dot diacritic above the major stroke. It contains 3 characters. Overall accuracy: 86.6%

        T   Q   q  X  Total
T      35   8   7  0   50
Q       1  48   1  0   50
q       2   1  47  0   50
Total  38  57  55  0
Table A.31: Confusion matrix for 4-stroke characters (initial half-form) with dot diacritic below the major stroke. It contains 3 characters. Overall accuracy: 94.0%

        †   …   ‚  X  Total
†      50   0   0  0   50
…       0  43   7  0   50
‚       0   2  48  0   50
Total  50  45  55  0
Table A.32: Confusion matrix for single-stroke characters (medial half-form). It contains 8 characters. Overall accuracy: 93.7%

        ¨   Z   ¼   À   n   v   ¡   è  X  Total
¨      41   0   2   0   0   4   1   2  0   50
Z       0  48   0   2   0   0   0   0  0   50
¼       2   0  48   0   0   0   0   0  0   50
À       0   0   0  49   1   0   0   0  0   50
n       0   0   0   0  50   0   0   0  0   50
v       0   0   3   0   0  47   0   0  0   50
¡       4   0   0   0   0   0  44   2  0   50
è       1   0   0   0   0   1   0  48  0   50
Total  48  48  52  51  51  52  45  52  0
Table A.33: Confusion matrix for 2-stroke characters (medial half-form) with dot diacritic above the major stroke. It contains 8 characters. Overall accuracy: 90.0%

        °   ¬   b   Ä   ¤   z  X  Total
°      42   6   0   0   1   1  0   50
¬       8  41   0   0   0   1  0   50
b       1   0  48   0   1   0  0   50
Ä       1   0   1  47   1   0  0   50
¤       2   0   1   0  46   1  0   50
z       0   1   0   3   0  46  0   50
Total  54  48  50  50  49  49  0
Table A.34: Confusion matrix for 2-stroke characters (medial half-form) with other-than-dot diacritic above the major stroke. It contains 4 characters. Overall accuracy: 93.0%

        ×   ¹   º   ‘  X  Total
×      40   1   1   8  0   50
¹       0  50   0   0  0   50
º       0   0  50   0  0   50
‘       4   0   0  46  0   50
Total  44  51  51  54  0
Table A.35: Confusion matrix for 2-stroke characters (medial half-form) with dot diacritic below the major stroke. It contains 2 characters. Overall accuracy: 100%

        J   V  X  Total
J      50   0  0   50
V       0  50  0   50
Total  50  50  0
Table A.36: Confusion matrix for 3-stroke characters (medial half-form) with dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 97.0%

        ´   N  X  Total
´      47   3  0   50
N       0  50  0   50
Total  47  53  0
Table A.37: Confusion matrix for 3-stroke characters (medial half-form) with other-than-dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 98.0%

        ò   ô  X  Total
ò      49   1  0   50
ô       1  49  0   50
Total  50  50  0
Table A.38: Confusion matrix for 4-stroke characters (medial half-form) with dot diacritic above the major stroke. It contains 2 characters. Overall accuracy: 96.0%

        R   r  X  Total
R      48   2  0   50
r       2  48  0   50
Total  50  50  0
Table A.39: Confusion matrix for 4-stroke characters (medial half-form) with dot diacritic below the major stroke. It contains 2 characters. Overall accuracy: 100%

        ‡   ƒ  X  Total
‡      50   0  0   50
ƒ       0  50  0   50
Total  50  50  0
Table A.40: Confusion matrix for single-stroke characters (terminal half-form). It contains 16 characters. Overall accuracy: 96.3%

        ©       š   f   \   ½   Ó   Á   j   É   ú   o   w   ñ   é   Ì   X  Total
©      49   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   50
        0  50   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   50
š       0   0  50   0   0   0   0   0   0   0   0   0   0   0   0   0   0   50
f       0   0   0  43   0   0   0   0   2   0   0   0   0   0   0   5   0   50
\       1   0   0   0  49   0   0   0   0   0   0   0   0   0   0   0   0   50
½      0   0   0   0   0  50   0   0   0   0   0   0   0   0   0   0   0   50
Ó      0   0   0   0   0   0  49   0   0   0   0   1   0   0   0   0   0   50
Á      0   0   0   0   0   0   0  49   0   0   1   0   0   0   0   0   0   50
j       0   0   0   0   0   0   0   1  49   0   0   0   0   0   0   0   0   50
É      0   0   1   0   0   0   0   0   0  49   0   0   0   0   0   0   0   50
ú      0   0   0   0   0   0   0   0   4   0  46   0   0   0   0   0   0   50
o       0   0   0   0   0   0   0   0   0   0   0  50   0   0   0   0   0   50
w       0   0   0   1   0   1   0   0   0   0   0   5  43   0   0   0   0   50
ñ      0   0   0   1   0   0   0   0   0   1   0   0   0  48   0   0   0   50
é      0   0   0   0   0   0   0   0   0   0   0   0   0   0  49   1   0   50
Ì      0   0   0   1   0   0   0   0   1   0   0   0   0   0   0  48   0   50
Total  50  50  51  46  50  51  49  50  56  50  47  56  43  48  49  54   0
Table A.41: Confusion matrix for 2-stroke characters (terminal half-form) with dot diacritic above the major stroke. It contains 9 characters. Overall accuracy: 91.5%

        ±   −   W   c   Å   h   l   ¥   |  X  Total
±      50   0   0   0   0   0   0   0   0  0   50
−       0  49   1   0   0   0   0   0   0  0   50
W       0   1  31  18   0   0   0   0   0  0   50
c       0   1  12  37   0   0   0   0   0  0   50
Å       0   0   0   0  49   0   0   0   1  0   50
h       1   0   0   0   0  47   1   1   0  0   50
l       0   0   0   0   0   0  50   0   0  0   50
¥       0   0   0   0   0   0   0  50   0  0   50
|       1   0   0   0   0   0   0   0  49  0   50
Total  52  51  44  55  49  47  51  51  50  0
Table A.42: Confusion matrix for 2-stroke characters (terminal half-form) with other-than-dot diacritic above the major stroke. It contains 7 characters. Overall accuracy: 99.4%

        ™   •   ˛   Ł   ¸   ’   Û  X  Total
™      49   1   0   0   0   0   0  0   50
•       0  49   0   0   0   1   0  0   50
˛       0   0  50   0   0   0   0  0   50
Ł       0   0   0  50   0   0   0  0   50
¸       0   0   0   0  50   0   0  0   50
’       0   0   0   0   0  50   0  0   50
Û       0   0   0   0   0   0  50  0   50
Total  49  50  50  50  50  51  50  0
Table A.43: Confusion matrix for 3-stroke characters (terminal half-form) with dot diacritic above the major stroke. It contains 3 characters. Overall accuracy: 100%

        µ   Ñ   O  X  Total
µ      50   0   0  0   50
Ñ       0  50   0  0   50
O       0   0  50  0   50
Total  50  50  50  0
Table A.44: Confusion matrix for 4-stroke characters (terminal half-forms) with dot diacritic above the major stroke. It contains 4 characters. Overall accuracy: 98.5%

        ˆ   S   s   ‹  X  Total
ˆ      50   0   0   0  0   50
S       0  50   0   0  0   50
s       0   1  49   0  0   50
‹       0   0   2  48  0   50
Total  50  51  51  48  0
Appendix B
Handwritten Urdu Character Samples
Figure B.1: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-1
Figure B.2: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-2

Figure B.3: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-3
Figure B.4: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-4

Figure B.5: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-5
Figure B.6: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-6

Figure B.7: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-7
Figure B.8: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-8

Figure B.9: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-9
Figure B.10: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-10

Figure B.11: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-11
Figure B.12: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-12

Figure B.13: A handwritten ensemble of all Urdu characters written on the canvas with the help of a stylus and digitizing tablet by writer-13
References
[1] S. D. Connell and A. K. Jain, “Writer adaptation for online handwriting
recognition,” IEEE Transactions on Pattern Analysis and Machine Intelli-
gence, vol. 24, no. 3, pp. 329–346, March 2002.
[2] S. D. Connell and A. K. Jain, “Template based online character recognition,”
Pattern Recognition, The Journal of the Pattern Recognition Society, vol. 34,
pp. 1–14, 2001.
[3] O. Goldwasser, “How the alphabet was born from hieroglyphs, discussion
with Anson Rainey,” Biblical Archaeology Review, Washington, DC: Biblical
Archaeology Society, vol. 36(1), pp. 40–53, March/April 2010. [Online].
Available: http://www.bib-arch.org/scholars-study/alphabet.asp
[4] L. Mitchell, “Earliest egyptian glyphs,” Archaeological Institute of
America, vol. 52, no. 2, March/April 1999. [Online]. Available:
archive.archaeology.org/9903/newsbriefs/egypt.html
[5] D. Crowley and P. Heyer, Communication in History: Technology, Culture,
Society. Boston: Allyn and Bacon/Pearson Education Inc., 2003.
[6] M. Kiefer and J.-L. Velay, “Writing in the digital age,” Trends in
Neuroscience and Education, vol. 5, no. 3, pp. 77–81, 2016, writing in
the digital age. [Online]. Available: http://www.sciencedirect.com/science/
article/pii/S2211949316300205
[7] C. B. Walker and W. V. Davies, Reading the Past: Ancient Writing from
Cuneiform to the Alphabet, J. T. Hooker, Ed. Berkeley: University of
California Press/British Museum, 1990.
[8] A. Mangen and J.-L. Velay, Digitizing Literacy: Reflections on the
Haptics of Writing in Advances in Haptics, IntechOpen, M. H. Zadeh,
Ed., 2010. [Online]. Available: https://www.intechopen.com/books/
advances-in-haptics/digitizing-literacy-reflections-on-the-haptics-of-writing
[9] B. Bash, “The Simple Joy of Writing by Hand,” Mindful, taking
time for what matters, vol. April, 2016. [Online]. Available: https:
//www.mindful.org/the-simple-joy-of-writing-by-hand/
[10] K. Yoshida and H. Sakoe, “Online Handwritten Character Recognition for a
Personal Computer System,” IEEE Transactions on Consumer Electronics,
vol. CE-28, pp. 202–209, Aug. 1982.
[11] R. B. Miller, “Response Time in Man-computer Conversational Trans-
actions,” in Proceedings of the December 9-11, 1968, Fall Joint
Computer Conference, Part I, ser. AFIPS ’68 (Fall, part I). New
York, NY, USA: ACM, 1968, pp. 267–277. [Online]. Available:
http://doi.acm.org/10.1145/1476589.1476628
[12] M. F. Zafar, D. Mohamad, and R. Othman, “On-line handwritten character
recognition: An implementation of counterpropagation neural net,” World
Academy of Science, Engineering and Technology International Journal of
Computer and Information Engineering, vol. 1, no. 10, 2007.
[13] E. Case, L. Vincent, and U. T. Lead, “Announcing Tesseract
OCR,” 2006. [Online]. Available: http://googlecode.blogspot.com/2006/
08/announcing-tesseract-ocr.html
[14] “GitHub - tesseract-ocr/tesseract: Tesseract Open Source OCR Engine
(main repository),” Retrieved: 2018-07-01. [Online]. Available: https:
//github.com/tesseract-ocr/tesseract#brief-history/
[15] S. V. Rice, F. R. Jenkins, and T. A. Nartker, “The Fourth Annual
Test of OCR Accuracy,” 1995. [Online]. Available: http://www.expervision.com/
wp-content/uploads/2012/12/1995.The_Fourth_Annual_Test_of_OCR_Accuracy.pdf
[16] “OCR, Canonical Ltd.” 2011. [Online]. Available: https://help.ubuntu.
com/community/OCR
[17] N. Willis, “Google’s Tesseract OCR engine is a quantum leap
forward,” 2006. [Online]. Available: https://www.linux.com/news/
googles-tesseract-ocr-engine-quantum-leap-forward
[18] “Readiris 17, the PDF and OCR solution for Win-
dows.” [Online]. Available: http://www.irislink.com/EN-GB/c1462/
Readiris-16-for-Windows---OCR-Software.aspx
[19] “ABBYY FineReader 14 for Windows: Features,” Retrieved: 2018-07-01.
[Online]. Available: https://www.abbyy.com/en-eu/finereader/in-details/
[20] “OmniPage 18 Save Time and Money with Superior Accuracy,
Datasheet,” Retrieved: 2018-07-01. [Online]. Available: https://www.nuance.
com/content/dam/nuance/en_us/collateral/imaging/data-sheet/ds-omnipage-standard-18-en-us.pdf
[21] “FREE ONLINE OCR SERVICE,” Retrieved: 2018-07-01. [Online].
Available: https://www.onlineocr.net/
[22] “MyScript App Support, The Devices Currently Sup-
ported by Nebo,” Retrieved: 2018-07-01. [Online]. Avail-
able: https://app-support.myscript.com/support/solutions/articles/
16000070607-what-are-the-devices-currently-supported-by-nebo-
[23] “MyScript: Nebo,” Retrieved: 2018-07-01. [Online]. Available: https:
//www.myscript.com/nebo
[24] “MyScript App Support, About Right-to-Left Languages,” Retrieved: 2018-
07-01. [Online]. Available: https://app-support.myscript.com/support/
solutions/articles/16000077562-what-about-right-to-left-languages-rtl-
[25] “Telecom indicators,” June 2016, [Online; accessed 14-June-2016]. [Online].
Available: http://www.pta.gov.pk/index.php?Itemid=599
[26] F. Baloch, “Telecom sector: Pakistan to have 40 million smart-
phones by end of 2016,” September 2015, [Online; accessed 18-
May-2016]. [Online]. Available: http://tribune.com.pk/story/953333/
telecom-sector-pakistan-to-have-40-million-smartphones-by-end-of-2016/
[27] S. T. Javed, S. Hussain, A. Maqbool, S. Asloob, S. Jamil, and M. H., “Seg-
mentation free Nastalique Urdu OCR,” World Academy of Science, Engi-
neering and Technology, vol. 4, 2010.
[28] D. A. Satti and K. Saleem, “Complexities and implementation challenges
in offline Urdu Nastaliq OCR,” in Proceedings of the Conference on Language
& Technology 2012 (CLT12), University of Engineering & Technology (UET),
Lahore, Pakistan, 2012, pp. 85–91.
[29] S. Naz, K. Hayat, M. I. Razzak, M. W. Anwar, S. A. Madani, and S. U.
Khan, “The optical character recognition of Urdu-like cursive scripts,” Pat-
tern Recognition, Elsevier, vol. 47, pp. 1229–1248, 2014.
[30] S. Malik and S. A. Khan, “Urdu online handwriting recognition,” in IEEE
International Conference on Emerging Technologies, 2005.
[31] N. Shahzad, B. Paulson, and T. Hammond, “Urdu Qaeda: recognition sys-
tem for isolated Urdu characters,” in IUI Workshop on Sketch Recognition,
2009.
[32] I. Haider and K. U. Khan, “Online recognition of single stroke handwritten
Urdu characters,” in IEEE 13th International Multitopic Conference (IN-
MIC2009), 2009.
[33] K. U. Khan and I. Haider, “Online recognition of multi-stroke handwritten
Urdu characters,” in Image Analysis and Signal Processing (IASP), 2010.
[34] N. H. Khan, A. Adnan, and S. Basar, “Urdu ligature recognition using
multi-level agglomerative hierarchical clustering,” Cluster Computing, May
2017. [Online]. Available: https://doi.org/10.1007/s10586-017-0916-2
[35] S. Shabbir and I. Siddiqi, “Optical Character Recognition System for Urdu
Words in Nastaliq Font,” International Journal of Advanced Computer Sci-
ence and Applications, vol. 7, no. 5, pp. 567–576, 2016.
[36] M. Hussain and M. N. Khan, “Online Urdu ligature recognition using spatial
temporal neural processing,” in IEEE International Multitopic Conference
(INMIC05), 2005.
[37] S. A. Husain, A. Sajjad, and F. Anwar, “Online Urdu character recognition
system,” in IAPR Machine Vision Applications (MVA2007), Conference on,
2007.
[38] M. I. Razzak, F. Anwar, S. A. Hussain, A. Belaid, and M. Sher, “HMM
and fuzzy logic: A hybrid approach for online Urdu script-based languages’
character recognition,” Knowledge-Based Systems, Elsevier, vol. 23, pp. 914–
923, 2010.
[39] M. I. Razzak, S. A. Hussain, A. M. Abdulrahman, and M. K. Khan, “Bio-
inspired multilayered and multilanguage Arabic script character recognition
system,” International Journal of Innovative Computing Information and
Control, vol. 8, no. 4, pp. 2681–2691, 2012.
[40] S. Naz, I. U. Arif, R. Ahmad, B. A. Saad, S. H. Shirazi, I. Siddiqi, and M. I.
Razzak, “Offline cursive Urdu-Nastaliq script recognition using multidimen-
sional recurrent neural networks,” Neurocomputing, vol. 177, pp. 228–241,
2016.
[41] S. Naz, A. I. Umar, R. Ahmed, M. I. Razzak, S. F. Rashid, and
F. Shafait, “Urdu Nasta’liq text recognition using implicit segmentation
based on multi-dimensional long short term memory neural networks,”
SpringerPlus, vol. 5, no. 1, p. 2010, Nov 2016. [Online]. Available:
https://doi.org/10.1186/s40064-016-3442-4
[42] S. Naz, A. Umar, R. Ahmad, S. Ahmed, S. Shirazi, and M. Razzak, “Urdu
Nasta’liq text recognition system based on multi-dimensional recurrent neu-
ral network and statistical features,” vol. 28, Sep. 2015.
[43] S. Naz, A. I. Umar, R. Ahmad, I. Siddiqi, S. B. Ahmed, M. I. Razzak, and
F. Shafait, “Urdu Nastaliq recognition using convolutional-recursive deep
learning,” Neurocomputing, vol. 243, pp. 80–87, 2017. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0925231217304654
[44] A. Ul-Hasan, S. B. Ahmed, F. Rashid, F. Shafait, and T. M. Breuel, “Offline
Printed Urdu Nastaleeq Script Recognition with Bidirectional LSTM Net-
works,” in 2013 12th International Conference on Document Analysis and
Recognition, Aug 2013, pp. 1061–1065.
[45] U. Pal and A. Sarkar, “Recognition of printed Urdu script,” in 7th Inter-
national Conference on Document Analysis and Recognition (ICDAR’03),
2003.
[46] Z. Ahmad, J. K. Orakzai, I. Shamsher, and A. Adnan, “Urdu Nastaleeq opti-
cal character recognition,” International Journal of Computer, Information,
Systems and Control Engineering, vol. 1(8), 2007.
[47] G. S. Lehal, “Choice of recognizable unit for Urdu OCR,” in Workshop on
Document Analysis and Recognition (DAR12), 2012.
[48] S. Zaman, W. Slany, and F. Saahito, “Recognition of segmented Ara-
bic/Urdu characters using pixel values as their features,” in ICCIT, 2012.
[49] S. T. Javed and S. Hussain, “Segmentation based Urdu Nastalique OCR,”
in 18th Iberoamerican Congress (CIARP2013), 2013, pp. 41–49.
[50] S. Naz, A. I. Umar, S. Bin Ahmed, S. H. Shirazi, M. I. Razzak, and
I. Siddiqi, “An OCR system for printed Nasta’liq script: A segmentation
based approach,” in IEEE 17th International, Multi-Topic Conference (IN-
MIC’2014), 2014, pp. 255–259.
[51] M. I. Razzak, M. Sher, and S. A. Hussain, “Locally baseline detection for
online Arabic script based languages character recognition,” International
Journal of the Physical Sciences, vol. 5, no. 7, pp. 955–959, 2010.
[52] M. I. Razzak, S. A. Hussain, M. K. Khan, and S. Muhammad, “Handling Di-
acritical Marks for Online Arabic Script Based Languages Character Recog-
nition using Fuzzy c-mean Clustering and Relative Position,” Information-
an International Interdisciplinary Journal, vol. 14, no. 1, pp. 157–165, 2011.
[53] M. I. Razzak, S. A. Husain, A. A. Mirza, and A. Belaid, “Fuzzy based
preprocessing using fusion of online and offline trait for online Urdu script
based languages character recognition,” International Journal of Innovative
Computing, Information and Control, vol. 85(A), pp. 3149–3161, 2012.
[54] E. Qaralleh, G. Abandah, and F. Jamour, “Tuning Recurrent Neural Net-
works for Recognizing Handwritten Arabic Words,” Journal of Software En-
gineering and Applications, vol. 6, no. 10, pp. 533–542, May 2013.
[55] I. A. Jannoud, “Automatic Arabic handwritten text recognition system,”
American Journal of Applied Sciences, vol. 4(11), pp. 857–864, 2007.
[56] A. Asiri and M. S. Khorsheed, “Automatic processing of handwritten Arabic
forms using neural networks,” in World Academy of Science, Engineering and
Technology, vol. 7, 2005.
[57] A. A. Aburas and S. M. A. Rehiel, “Off-line omni-style handwriting Arabic
character recognition system based on wavelet Compression,” vol. 3(4), pp.
123–135, 2007.
[58] G. Kour and R. Saabne, “Fast classification of handwritten on-line Arabic
characters,” in 2014 6th International Conference of Soft Computing and
Pattern Recognition (SoCPaR), Aug 2014, pp. 312–318.
[59] A. Mowlaei, K. Faez, and A. T. Haghighat, “Feature extraction with wavelet
transform for recognition of isolated handwritten Farsi/Arabic characters
and numerals,” in IEEE 13th Workshop on Neural Networks for Signal Pro-
cessing, NNSP’03, 2003, pp. 547–554.
[60] A. Broumandnia, J. Shanbehzadeh, and M. R. Varnoosfaderani, “Per-
sian/Arabic handwritten word recognition using M-band packet wavelet
transform,” Image Vision Computing, vol. 26(6), pp. 829–842, 2008.
[61] M. R. Jenabzade, R. Azmi, P. B., and S. Shirazi, “Two methods for recogni-
tion of handwritten Farsi characters,” International Journal of Image Pro-
cessing (IJIP), vol. 5(4), 2011.
[62] S. Nasrollahi and A. Ebrahimi, “Printed Persian Subword Recognition
Using Wavelet Packet Descriptors,” Journal of Engineering, vol. 2013, 2013.
[Online]. Available: https://doi.org/10.1155/2013/465469
[63] V. Ghods and M. K. Sohrabi, “Online Farsi Handwritten Character Recogni-
tion Using Hidden Markov Model,” Journal of Computers, vol. 11(2), 2016.
[64] Q. Safdar and K. U. Khan, “Online Urdu Handwritten Character Recog-
nition: Initial Half Form Single Stroke Characters,” in 12th International
Conference on Frontiers of Information Technology, Dec 2014, pp. 292–297.
[65] K. C. Santosh and E. Iwata, Stroke-based cursive char-
acter recognition, Advances in Character Recognition,
P. X. Ding, Ed. InTech, 2012. [Online]. Available:
http://www.intechopen.com/books/advances-in-character-recognition/
stroke-based-cursive-character-recognition
[66] G. Boccignone, A. Chianese, L. Cordella, and A. Marcelli, “Recovering dy-
namic information from static handwriting,” Pattern Recognition, vol. 26,
pp. 409–418, 1993.
[67] G. C. Viard, L. M. Pierre, and S. Knerr, “Recognition directed recovering
of temporal information from handwriting images,” Pattern Recognition
Letters, vol. 26, pp. 2537–2548, 2005.
[68] D. S. Doermann and A. Rosenfeld, “Recovery of temporal information from
static images of handwriting,” International Journal of Computer Vision,
vol. 15(1-2), pp. 143–164, 1995.
[69] Y. Qiao, M. Nishiara, and M. Yasuhara, “A framework toward restoration
of writing order from single-stroked handwriting image,” IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 28(11), pp. 1724–1737,
2006.
[70] “Ethnologue: Languages of the World,” https://www.ethnologue.com.
[71] “National Language, THE CONSTITUTION OF THE ISLAMIC REPUBLIC
OF PAKISTAN,” http://na.gov.pk/uploads/documents/1333523681_951.pdf.
[72] “EIGHTH Schedule, Languages, THE CONSTITUTION OF INDIA,”
https://www.india.gov.in/sites/upload_files/npi/files/coi-eng-schedules_1-12.pdf.
[73] “Language of the Nation, Nepal’s Constitution of 2015,”
https://www.constituteproject.org/constitution/Nepal_2015.pdf.
[74] “PAKISTAN: THE WORLD FACTBOOK,” https://www.cia.gov/library/
publications/the-world-factbook/geos/pk.html.
[75] A. Mirza, “Urdu as a First Language: The Impact of Script on Reading in the
L1 and English as a Second Language,” Ph.D. dissertation, Wilfrid Laurier
University, 2014. [Online]. Available: http://scholars.wlu.ca/etd/1660
[76] G. Cardona and D. Jain, “The Indo-Aryan Languages,” Routledge, Routledge
Language Family Series, 2003.
[77] “National Language Promotion Department.” [Online]. Available: http:
//nlpd.gov.pk
[78] H. P. and I. Sloan, A Grammar of Pashto: A Descriptive Study of the Dialect
of Kandahar, Afghanistan. Ishi Press International, 2009.
[79] S. Abdul Khair Kashfi, “Noori Nastaliq Revolution in Urdu Composing,”
2008.
[80] R. Safabakhsh and P. Adibi, “Nastaaligh handwritten word recognition us-
ing a continuous-density variable-duration HMM,” The Arabian Journal for
Science and Engineering, vol. 30(1B), 2005.
[81] A. Muaz, “Urdu Optical Character Recognition System,” Ph.D. dissertation,
National University of Computer & Emerging Sciences Lahore, Pakistan,
2010.
[82] G. A. Abandah and F. T. Jamour., “Recognizing handwritten Arabic script
through efficient skeleton-based grapheme segmentation algorithm,” in 10th
International Conference on Intelligent Systems Design and Applications,
Nov 2010, pp. 977–982.
[83] H. E. Abed, V. Margner, M. Kherallah, and A. M. Alimi, “ICDAR 2009
Online Arabic Handwriting Recognition Competition,” in 2009 10th Inter-
national Conference on Document Analysis and Recognition, July 2009, pp.
1388–1392.
[84] A. Wahi, S. Sundaramurthy, and P. Poovizhi, “Recognition of handwritten
Tamil characters using wavelet,” International Journal of Computer Science
& Engineering Technology (IJCSET), vol. 5(4), 2014.
[85] P. Singh and S. Budhiraja, “Handwritten Gurmukhi character recognition
using wavelet transform,” vol. 2, no. 3, 2012.
[86] S. Jaeger, S. Manke, J. Reichert, and A. Waibel, “Online handwriting recog-
nition: the NPen++ Recognizer,” International Journal of Document Anal-
ysis and Recognition, IJDAR, 2001.
[87] M. D. Al-Hassani, “Optical character recognition system for multifont En-
glish texts using DCT and Wavelet Transform,” vol. 4(6), 2013.
[88] S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way. Academic
Press/Elsevier Inc., San Diego, 2008.
[89] R. C. Gonzalez and R. E. Woods, Digital Image Processing (3rd Edition).
Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2006.
[90] C. B. Amar, M. Zaied, and A. Alimi, “Beta wavelets. Synthesis
and application to lossy image compression,” Advances in Engineering
Software, vol. 36, no. 7, pp. 459 – 474, 2005, advanced Algorithms
and Architectures for Signal Processing. [Online]. Available: http:
//www.sciencedirect.com/science/article/pii/S0965997805000116
[91] D. K. Patel, T. Som, S. K. Yadav, and M. K. Singh, “Handwritten character
recognition using multiresolution technique and euclidean distance metric,”
vol. 3, pp. 208–214, 2012.
[92] W. Wei, L. Ming, G. Weina, W. Dandan, and L. Jing, “A new mind of
wavelet transform for handwritten Chinese character recognition,” in Sec-
ond International Conference on Instrumentation, Measurement, Computer,
Communication and Control (IMCCC), 2012.
[93] K. P. Primekumar and S. M. Idiculla, “On-line Malayalam handwritten char-
acter recognition using wavelet transform and SFAM,” in 3rd International
Conference on Electronics Computer Technology (ICECT), vol. 1, 2011.
[94] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA, USA: Society
for Industrial and Applied Mathematics, 1992.
[95] N. Murru and R. Rossini, “A Bayesian approach for initialization of weights
in backpropagation neural net with application to character recognition,”
Neurocomputing, vol. 193, pp. 92 – 105, 2016. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0925231216001624
[96] A. Prieto, B. Prieto, E. M. Ortigosa, E. Ros, F. Pelayo, J. Ortega,
and I. Rojas, “Neural networks: An overview of early research, current
frameworks and new challenges,” Neurocomputing, vol. 214, pp. 242 – 268,
2016. [Online]. Available: http://www.sciencedirect.com/science/article/
pii/S0925231216305550
[97] I. Shamsher, Z. Ahmad, J. K. Orakzai, and A. Adnan, “OCR for printed
Urdu script using feed forward neural network,” World Academy of Science,
Engineering and Technology, vol. 1(10), 2007.
[98] W. A. Salameh and M. A. Otair, “Online handwritten character recogni-
tion using an optical backpropagation neural network,” Issues in Informing
Science and Information Technology, vol. 3, 2005.
[99] T. Sergios and K. Konstantinos, Pattern Recognition, Fourth Edition, 4th ed.
Academic Press, 2008.
[100] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis.
Cambridge University Press, 2004.
[101] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector
machines,” ACM Transactions on Intelligent Systems and Technology,
vol. 2, pp. 27:1–27:27, 2011. [Online]. Available: http://www.csie.ntu.edu.
tw/~cjlin/libsvm
[102] Z. C. Lipton, “A Critical Review of Recurrent Neural Networks for
Sequence Learning,” CoRR, vol. abs/1506.00019, 2015. [Online]. Available:
http://arxiv.org/abs/1506.00019
[103] A. Graves, “RNNLIB: A recurrent neural network library for sequence learn-
ing problems,” http://sourceforge.net/projects/rnnl/.
[104] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A Fast Learning Algorithm
for Deep Belief Nets,” Neural Comput., vol. 18, no. 7, pp. 1527–1554, Jul.
2006. [Online]. Available: http://dx.doi.org/10.1162/neco.2006.18.7.1527
[105] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, “A
survey of deep neural network architectures and their applications,”
Neurocomputing, vol. 234, pp. 11–26, 2017. [Online]. Available: http:
//www.sciencedirect.com/science/article/pii/S0925231216315533
[106] P. Smolensky, “Parallel Distributed Processing: Explorations in the
Microstructure of Cognition,” D. E. Rumelhart, J. L. McClelland,
and C. PDP Research Group, Eds. Cambridge, MA, USA: MIT
Press, 1986, vol. 1, ch. Information Processing in Dynamical Systems:
Foundations of Harmony Theory, pp. 194–281. [Online]. Available:
http://dl.acm.org/citation.cfm?id=104279.104290
[107] G. E. Dahl, D. Yu, L. Deng, and A. Acero, “Context-Dependent Pre-Trained
Deep Neural Networks for Large-Vocabulary Speech Recognition,” IEEE
Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp.
30–42, Jan 2012.
[108] Y. Bengio, “Learning Deep Architectures for AI,” Found. Trends Mach.
Learn., vol. 2, no. 1, pp. 1–127, Jan. 2009. [Online]. Available:
http://dx.doi.org/10.1561/2200000006
[109] R. Salakhutdinov, “Learning Deep Generative Models,” Ph.D. dissertation,
University of Toronto, Toronto, Ont., Canada, 2009. AAINR61080.
[110] P. Vincent, H. Larochelle, Y. Bengio, and P. A. Manzagol, “Extracting and
Composing Robust Features with Denoising Autoencoders,” in Proceedings
of the 25th International Conference on Machine Learning, ser. ICML ‘08.
New York, NY, USA: ACM, 2008, pp. 1096–1103. [Online]. Available:
http://doi.acm.org/10.1145/1390156.1390294
[111] C. Xing, L. Ma, and X. Yang, “Stacked Denoise Autoencoder Based Feature
Extraction and Classification for Hyperspectral Images,” pp. 1–10, 01 2016.
[112] Y. Ju, J. Guo, and S. Liu, “A Deep Learning Method Combined Sparse Au-
toencoder with SVM,” in 2015 International Conference on Cyber-Enabled
Distributed Computing and Knowledge Discovery, Sept 2015, pp. 257–260.
[113] M. Ali Keyvanrad and M. Homayoonpoor, “A brief survey on deep belief
networks and introducing a new object oriented MATLAB toolbox,” Aug. 2014,
arXiv:1408.3264.