Hand Gesture Recognition of English Alphabets using Artificial Neural Network
Recent Trends in Information Systems (ReTIS-15), Kolkata
S. Bhowmick, S. Kumar and A. Kumar
Presented by: A. Kumar
Department of Electronics and Communication Engineering
Tezpur University, Assam
Anurag Kumar et al., ReTIS-15 (Paper Id. 246), July 10, 2015
Plan of Talk
• Introduction
• Motivation
• Theoretical Background
• Literature Survey
• Problems Identified
• Proposed Techniques and Approaches
• Experimental Results and Discussions
• Comparison with Previous Work
• Limitations
• Conclusion and Future Direction
• References
Introduction
• Gestures are all sorts of non-verbally communicated information [2].
• Gestures can be facial expressions, limb movements or any meaningful body state [1].
• Gestures are of two types: static and dynamic.
Figure: Gesture types
Motivation
• Free-hand based tracking is still a challenging issue.
• Recognising similar gestures and handling movement epenthesis in continuous sign-language conversation remain among the biggest challenges for researchers.
• This work is a step towards helping hearing- and speech-impaired people.
Figure: Similar gestures and movement epenthesis
Theoretical Background
Figure: Gesture recognition system block diagram
Theoretical Background (contd.)
• Colour image planes
I RGB, HSV, YCbCr etc.
• Segmentation
I Segmenting the region of interest (hand) from background and other body parts
Theoretical Background (contd.)
• Moment or centroid

I A moment is a gross characteristic of a contour computed by integrating or summing over all of the pixels of the contour [10]

    m_{p,q} = Σ_{x,y} I(x, y) x^p y^q    (1)

    (X, Y) = (m_{1,0}/m_{0,0}, m_{0,1}/m_{0,0})    (2)
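The centroid of Eqs. (1)-(2) can be computed directly on a binary mask. A minimal sketch, assuming NumPy (not the paper's actual implementation):

```python
import numpy as np

def centroid(mask):
    """Centroid (X, Y) of a binary mask from raw moments:
    X = m10/m00, Y = m01/m00 (Eqs. 1-2)."""
    ys, xs = np.nonzero(mask)      # coordinates of pixels with I(x, y) = 1
    m00 = xs.size                  # zeroth moment: number of set pixels
    if m00 == 0:
        return None                # empty mask: no centroid
    return xs.sum() / m00, ys.sum() / m00

# A 3x3 block of ones centred at (2, 2)
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
print(centroid(mask))              # (2.0, 2.0)
```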
• Feature extraction
I Orientation, trajectory length and acceleration
• Classification
I Artificial neural network
Types of Gesture Recognition System
• Glove-based systems
I Extra sensors provide better accuracy.
I Tedious installation/set-up and cumbersome to use [2], [3], [5].
• Vision-based systems
I Easy capture of the gestures in front of a camera.
I No set-up required; natural and easy to use [2], [3], [5].
Figure: Glove-based and vision-based systems
Literature Survey
• “Dynamic Hand Gesture Recognition Using Hidden Markov Model," Z. Yang, Y. Li, W. Chen, and Y. Zheng [3].
• “Trajectory Guided Recognition of Hand Gestures having only Global Motions," M. K. Bhuyan, P. K. Bora, and D. Ghosh [5].
• “Continuous Gesture Trajectory Recognition System Based on Computer Vision," X. Wenkai and E. J. Lee [6].
• “Vision-Based Continuous Sign Language Recognition using Product HMM," S. H. Yu, C. L. Huang, S. C. Hsu, H. W. Lin, and H. W. Wang [7].
• “Gesture Recognition for Alphabets from Hand Motion Trajectory Using Hidden Markov Models," M. Elmezain, A. Hamadi, G. Krell, S. Etriby, and B. Michaelis [8].
• “Evaluation of HMM Training Algorithms for Letter Hand Gesture Recognition," N. Liu, B. C. Lovell, and P. J. Kootsookos [9].
Problems Identified
• Free hand-based tracking
• Recognition of similar gestures and movement epenthesis
Proposed Techniques and Approaches
• Select the best segmentation model
• Consider the best lighting condition
Proposed Techniques and Approaches (contd.)
• We have considered three hand-segmentation models
X Image intensity based model
X Background subtraction model
X HSV+YCbCr model based on skin-colour
• Three lighting conditions considered are
X Normal light
X Poor light
X Bright light
Results at Various Lighting Conditions
Figure: Results at various lighting conditions
Comparison of Results among the Models
Model                          Normal      Poor light   Bright light
Image-intensity                Working     Sensitive    Working
Background subtraction         Sensitive   Sensitive    Working
Skin-colour based HSV+YCbCr    Robust      Working      Robust

• The HSV+YCbCr model, having the best performance, has been adopted in this work.
The Steps/Algorithm of the Adopted System Model
• The input RGB frame is converted to the HSV and YCbCr planes,
• Skin pixels are extracted by thresholding both frames,
• Both frames are converted into binary frames,
• Morphological operations follow to remove unwanted noisy pixels,
• A logical AND operation is performed between the planes to remove background noise and to retain the most connected region,
• The centroid of the segmented palm region is calculated using moment calculation,
• Gesture trajectories are found by connecting the centroids of the segmented region across frames,
• Alphabets are drawn by hand gestures.
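The thresholding, AND, and trajectory steps above can be sketched as follows (assuming NumPy). The numeric threshold ranges are illustrative placeholders, since the paper does not list its actual HSV/YCbCr bounds; the morphological clean-up step is omitted for brevity:

```python
import numpy as np

def skin_mask(hsv, ycbcr):
    """Threshold the HSV and YCbCr planes into two binary frames and
    AND them to suppress background noise. The numeric ranges below
    are illustrative placeholders, not the paper's actual thresholds."""
    h, s = hsv[..., 0], hsv[..., 1]
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    mask_hsv = (h < 50) & (s > 58) & (s < 174)
    mask_ycbcr = (cb > 77) & (cb < 127) & (cr > 133) & (cr < 173)
    return mask_hsv & mask_ycbcr

def trajectory(masks):
    """Gesture trajectory: centroids of the segmented region, frame by frame."""
    points = []
    for m in masks:
        ys, xs = np.nonzero(m)
        if xs.size:                          # skip frames with no skin pixels
            points.append((xs.mean(), ys.mean()))
    return points
```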
Flowchart of the System Model
Figure: Flowchart of the system model
Feature Extraction Techniques
I The features considered in the work are
• Orientation feature
• Gesture trajectory length
• Velocity and acceleration
Feature Extraction Techniques (contd.)
• Orientation feature

    θ_i = arctan( (y_i − y_{i−1}) / (x_i − x_{i−1}) ),  i = 1, 2, …, N    (3)

• Gesture trajectory length [5]

    d_i = √( (x'_i − x'_{i−1})² + (y'_i − y'_{i−1})² )    (4)

    l_N = Σ_i √( (x'_i − x'_{i−1})² + (y'_i − y'_{i−1})² ) = Σ_i d_i    (5)
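A small sketch of the orientation and trajectory-length features of Eqs. (3)-(5), using atan2 in place of arctan so that vertical segments (x_i = x_{i−1}) are handled safely:

```python
import math

def orientation(points):
    """θ_i between successive centroids (Eq. 3); atan2 avoids division
    by zero when x_i = x_{i-1}."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def trajectory_length(points):
    """Total trajectory length l_N as the sum of segment lengths d_i (Eqs. 4-5)."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

pts = [(0, 0), (3, 4), (6, 8)]       # two 3-4-5 segments
print(trajectory_length(pts))        # 10.0
```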
Feature Extraction Techniques (contd.)
• Velocity and acceleration

X Velocity (v) is defined as the rate of change of distance with time.

    v = ds/dt    (6)

X In the coordinate system, the velocity [5] is determined by

    v_i = (v_x, v_y) = ( (x_{i+1} − x_i)/(t_{i+1} − t_i), (y_{i+1} − y_i)/(t_{i+1} − t_i) )    (7)

X Acceleration (a) is defined as the rate of change of velocity.

    a = dv/dt    (8)

    a_i = (v_{i+1} − v_i)/(t_{i+1} − t_i)    (9)
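Eqs. (7) and (9) as finite differences over the centroid trajectory (an illustrative sketch, not the paper's code):

```python
def velocities(points, times):
    """v_i = ((x_{i+1}-x_i)/(t_{i+1}-t_i), (y_{i+1}-y_i)/(t_{i+1}-t_i)) — Eq. (7)."""
    return [((x1 - x0) / (t1 - t0), (y1 - y0) / (t1 - t0))
            for (x0, y0), (x1, y1), t0, t1
            in zip(points, points[1:], times, times[1:])]

def accelerations(vels, times):
    """a_i = (v_{i+1} - v_i)/(t_{i+1} - t_i), componentwise — Eq. (9)."""
    return [((u1 - u0) / (t1 - t0), (w1 - w0) / (t1 - t0))
            for (u0, w0), (u1, w1), t0, t1
            in zip(vels, vels[1:], times, times[1:])]

# Constant velocity along the trajectory gives zero acceleration —
# the property used later to spot symbol boundaries.
pts = [(0, 0), (1, 1), (2, 2), (3, 3)]
ts = [0, 1, 2, 3]
v = velocities(pts, ts)
a = accelerations(v, ts)
```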
Feature Extraction System Model
Figure: Feature extraction block diagram
Feature Extraction System Model (contd.)
Figure: Feature extraction techniques
Network Specifications of MLP for Isolated Gesture Alphabets
ANN type                   MLP
Training algorithm         traingda
Maximum number of epochs   200
Input layer size           128
MSE goal                   0.0001
Training time              18.05 seconds
Testing time               15.86 seconds
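traingda is MATLAB's gradient descent with an adaptive learning rate; the sketch below shows the analogous plain gradient-descent update for a small MLP in NumPy. The hidden-layer size and class count are illustrative assumptions (the paper does not state them), and the adaptive learning-rate schedule is only noted in a comment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 128 inputs (matching the table), 20 hidden units,
# 8 output classes (one per isolated gesture alphabet).
W1 = rng.normal(0, 0.1, (128, 20)); b1 = np.zeros(20)
W2 = rng.normal(0, 0.1, (20, 8));   b2 = np.zeros(8)
lr = 0.01

def forward(x):
    """One hidden tanh layer followed by a softmax output."""
    h = np.tanh(x @ W1 + b1)
    z = h @ W2 + b2
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return h, e / e.sum(axis=-1, keepdims=True)

def train_step(x, y_onehot):
    """One plain gradient-descent step. traingda would additionally grow
    or shrink lr depending on whether the error decreased (omitted here)."""
    global W1, b1, W2, b2
    h, p = forward(x)
    dz = (p - y_onehot) / len(x)         # softmax cross-entropy gradient
    dW2, db2 = h.T @ dz, dz.sum(0)
    dh = (dz @ W2.T) * (1 - h ** 2)      # backprop through tanh
    dW1, db1 = x.T @ dh, dh.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```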
Isolated Gesture Alphabets
Figure: A few isolated gestures considered in the work
Confusion Matrix for Isolated Gesture Alphabets
Gesture   S    T    CS   RS   WS   RR (%) = CS/T × 100
A         30   30   27   1    2    90.0%
B         30   30   26   2    2    86.7%
C         30   30   29   1    0    96.7%
D         30   30   25   2    3    83.3%
E         30   30   29   1    0    96.7%
F         30   30   29   0    1    96.7%
O         30   30   28   1    1    93.3%
Q         30   30   29   1    0    96.7%

• S is the total number of samples trained, T the number of samples tested, CS the number of correctly spotted gestures, RS the number of rejected gesture samples, WS the number of wrongly spotted gestures, and RR the recognition rate in %.

• Overall recognition rate = (240 − 18)/240 = 92.5%
Recognition Rate Results for Isolated Gesture Alphabets
No. of trained samples per gesture class      30
No. of tested samples per gesture class       30
No. of validation samples per gesture class   30
No. of gestures taken                         8
Total no. of gestures tested                  30 × 8 = 240
Total misclassified gestures                  18
Recognition rate (percent)                    222/240 = 92.5%
Best result                                   C, E, F, Q
Poorest result                                D
Hand Gesture Recognition for Continuous Gesture Alphabets
• Regular approaches for detecting isolated gesture symbols do not hold good for continuous gestures, as the problem of separating two gestures arises.
• The separation requires another powerful feature, such as acceleration, to detect the individual symbol boundary.
• It has been observed that the velocity (v) remains almost constant within a particular gesture symbol, so the acceleration (a) becomes zero: if v = constant, then a = dv/dt = 0.
• By selecting a suitable threshold for the acceleration, we can detect the symbol boundary.
• In this work, the acceleration and velocity features have been used to spot the start and end points of each individual gesture in the continuous gestures.
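The boundary-spotting idea can be sketched as a simple threshold test on the acceleration magnitude (the threshold value here is an illustrative placeholder; the paper selects a suitable one empirically):

```python
def symbol_boundaries(accel_mag, threshold=0.5):
    """Frame indices where the acceleration magnitude exceeds a threshold,
    taken as candidate start/end points between gesture symbols.
    The threshold value is an illustrative placeholder."""
    return [i for i, a in enumerate(accel_mag) if a > threshold]

# |a| stays near zero inside a symbol and spikes at the transition
accel = [0.0, 0.1, 0.05, 2.0, 0.1, 0.0]
print(symbol_boundaries(accel))      # [3]
```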
Hand Gesture Recognition for Continuous Gesture Alphabets (contd.)
• A Multi-Layer Perceptron (MLP) and a Focused Time-Delay Neural Network (FTDNN) are trained with this feature information to recognize the individual gestures within the continuous gestures.
Figure: Continuous gesture recognition techniques
Continuous Gesture Alphabet
Figure: Continuous gesture alphabet BA
Network Specifications of MLP for Continuous Gesture Alphabets
ANN type                   MLP
Training algorithm         traingda
Maximum number of epochs   200
Input layer size           256
MSE goal                   0.0001
Training time              25.45 seconds
Testing time               21.56 seconds
Confusion Matrix for Continuous Gesture Alphabets with MLP
Gesture   S    T    CS   RS   WS   RR (%) = CS/T × 100
BA        30   30   26   2    2    86.7%
AD        30   30   25   2    3    83.3%
CA        30   30   28   2    0    93.3%
CB        30   30   27   1    2    90.0%
CF        30   30   29   1    0    96.7%
DB        30   30   26   3    1    86.7%
AA        30   30   26   2    2    86.7%

• Overall recognition rate = (210 − 23)/210 = 89.05%
Recognition Rate Results for Continuous Gesture Alphabets with MLP
No. of trained samples per gesture class 30
No. of tested samples per gesture class 30
No. of validation samples per gesture class 30
No. of gestures taken 7
Total no. of gestures tested                  30 × 7 = 210
Total misclassified gestures                  23
Recognition rate (percent)                    187/210 = 89.05%
Best result                                   CF
Poorest result                                AD
Network Specifications of FTDNN for Continuous Gesture Alphabets
ANN type                   FTDNN
Training algorithm         traingda
Maximum number of epochs   200
Input layer size           256
MSE goal                   0.0001
Training time              26.07 seconds
Testing time               21.34 seconds
Confusion Matrix for Continuous Alphabets with FTDNN
Gesture   S    T    CS   RS   WS   RR (%) = CS/T × 100
BA        30   30   25   1    4    83.3%
AD        30   30   25   3    2    83.3%
CA        30   30   27   1    2    90.0%
CB        30   30   28   2    0    93.3%
CF        30   30   29   0    1    96.7%
DB        30   30   24   4    2    80.0%
AA        30   30   25   1    4    83.3%

• Overall recognition rate = (210 − 27)/210 = 87.14%
Recognition Rate Results for Continuous Gesture Alphabets with FTDNN
No. of trained samples per gesture class 30
No. of tested samples per gesture class 30
No. of validation samples per gesture class 30
No. of gestures taken 7
Total no. of gestures tested                  30 × 7 = 210
Total misclassified gestures                  27
Recognition rate (percent)                    183/210 = 87.14%
Best result                                   CF
Poorest result                                DB
Discussions
• The combined orientation and gesture-trajectory-length features for isolated gestures efficiently deal with the problem of recognizing similar gestures.
• The velocity and acceleration features enable us to recognize the continuous gestures, and in the process the problem of movement epenthesis is removed.
Comparison with Previous Work
• In [6], the recognition rate obtained for isolated Arabic numerals (0-9) is 93.72% using a Hidden Markov Model (HMM), increased to 95.76% with an HMM-FNN (Fuzzy Neural Network) hybrid framework.
• HMM and Conditional Random Field (CRF) models are more frequently used for pattern recognition than ANN, and [6] considers only isolated gestures; the recognition rate obtained in this work can therefore be said to be satisfactory.
Limitations
• The method is illumination sensitive.
• A skin-coloured background deteriorates the result.
• The gesturer must always wear full-sleeve clothing.
• The presence of multiple gesturers degrades the result.
• The method is not a real-time gesture recognition system.
Conclusion and Future Direction
• A free-hand-based gesture recognition system for both isolated and continuous gestures has been developed.
• The problems of recognizing similar gestures and of movement epenthesis have been eradicated.
• Future work will extend the system to bare hands, without the need for full-sleeve clothing.
References
[1] S. Mitra and T. Acharya, “Gesture Recognition: A Survey," in IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 37, no. 3, pp. 311-324, 2007.
[2] M. E. Al-Ahdal and N. M. Tahir, “Review in Sign Language Recognition Systems," in IEEE Symposium on Computers and Informatics, pp. 52-57, 2002.
[3] Z. Yang, Y. Li, W. Chen, and Y. Zheng, “Dynamic Hand Gesture Recognition Using Hidden Markov Model," in 7th International Conference on Computer Science Education, pp. 360-365, 2012.
[4] J. Appenrodt, A. Hamadi, and B. Michaelis, “Data Gathering for Gesture Recognition Systems Based on Single Color, Stereo Color and Thermal Cameras," in International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 3, pp. 37-50, 2010.
[5] M. K. Bhuyan, P. K. Bora, and D. Ghosh, “Trajectory Guided Recognition of Hand Gestures having only Global Motions," in World Academy of Science, Engineering and Technology, vol. 21, pp. 753-764, 2008.
[6] X. Wenkai and E. J. Lee, “Continuous Gesture Trajectory Recognition System Based on Computer Vision," in International Journal of Applied Mathematics and Information Science, pp. 339-346, 2012.
[7] S. H. Yu, C. L. Huang, S. C. Hsu, H. W. Lin, and H. W. Wang, “Vision-Based Continuous Sign Language Recognition using Product HMM," in First Asian Conference on Pattern Recognition (ACPR), pp. 510-514, 2011.
[8] M. Elmezain, A. Hamadi, G. Krell, S. Etriby, and B. Michaelis, “Gesture Recognition for Alphabets from Hand Motion Trajectory Using Hidden Markov Models," in The 7th IEEE International Symposium on Signal Processing and Information Technology, pp. 1209-1214, 2007.
[9] N. Liu, B. C. Lovell, and P. J. Kootsookos, “Evaluation of HMM Training Algorithms for Letter Hand Gesture Recognition," in Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), pp. 648-651, 2003.
[10] G. Bradski and A. Kaehler, Learning OpenCV, 1st Ed., O'Reilly Media Inc., 2008.