Hand Gesture Recognition of English Alphabets using Artificial Neural Network

Recent Trends in Information Systems (ReTIS-15), Kolkata
S. Bhowmick, S. Kumar and A. Kumar
Presented by: A. Kumar
Department of Electronics and Communication Engineering, Tezpur University, Assam
July 2015
Anurag Kumar et al., ReTIS-15 (Paper Id. 246), July 10, 2015


Plan of Talk

• Introduction
• Motivation
• Theoretical Background
• Literature Survey
• Problems Identified
• Proposed Techniques and Approaches
• Experimental Results and Discussions
• Comparison with Previous Work
• Limitations
• Conclusion and Future Direction
• References


Introduction

• Gestures encompass all sorts of non-verbally communicated information [2].

• Gestures can be facial expressions, limb movements or any meaningful body state [1].

• Gestures are of two types: static and dynamic.

Figure: Gesture types


Motivation

• Free-hand tracking is still a challenging problem.

• The recognition of similar gestures and movement epenthesis in continuous sign-language conversation are among the biggest challenges facing researchers.

• This work is a step towards helping hearing- and speech-impaired people.

Figure: Similar gestures and movement epenthesis


Theoretical Background

Figure: Gesture recognition system block diagram


Theoretical Background (contd.)

• Colour image planes
  ◦ RGB, HSV, YCbCr, etc.

• Segmentation
  ◦ Segmenting the region of interest (the hand) from the background and other body parts


Theoretical Background (contd.)

• Moment or centroid
  ◦ A moment is a gross characteristic of a contour, computed by integrating or summing over all of the pixels of the contour [10]

m_{p,q} = Σ_{x,y} I(x, y) x^p y^q        (1)

(X̄, Ȳ) = (m_{1,0} / m_{0,0}, m_{0,1} / m_{0,0})        (2)

• Feature extraction
  ◦ Orientation, trajectory length and acceleration

• Classification
  ◦ Artificial neural network
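The moment and centroid computations of Eqs. (1)-(2) can be sketched in a few lines. This is a minimal illustration only, assuming the segmented hand is given as a binary mask where I(x, y) is 1 for hand pixels and 0 elsewhere:

```python
# Sketch of centroid computation from raw image moments (Eqs. 1-2),
# assuming a binary mask: I(x, y) = 1 for hand pixels, 0 otherwise.

def moment(img, p, q):
    """Raw moment m_{p,q} = sum over pixels of I(x, y) * x^p * y^q."""
    return sum(img[y][x] * (x ** p) * (y ** q)
               for y in range(len(img))
               for x in range(len(img[0])))

def centroid(img):
    """Centroid (X, Y) = (m_{1,0}/m_{0,0}, m_{0,1}/m_{0,0})."""
    m00 = moment(img, 0, 0)
    return moment(img, 1, 0) / m00, moment(img, 0, 1) / m00

# Example: a 2x2 block of foreground pixels at columns 1-2, rows 0-1.
mask = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
print(centroid(mask))  # -> (1.5, 0.5)
```

In a real implementation these sums would be computed by a library routine (e.g. an image-moments function) rather than explicit Python loops.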


Types of Gesture Recognition System

• Glove-based systems
  ◦ Extra sensors provide better accuracy.
  ◦ Tedious installation/set-up and cumbersome to use [2], [3], [5].

• Vision-based systems
  ◦ Gestures are easily captured in front of a camera.
  ◦ No set-up required; natural and easy to use [2], [3], [5].

Figure: Glove-based and vision-based systems


Literature Survey

• “Dynamic Hand Gesture Recognition Using Hidden Markov Model," Z. Yang, Y. Li, W. Chen, and Y. Zheng [3].

• “Trajectory Guided Recognition of Hand Gestures having only Global Motions," M. K. Bhuyan, P. K. Bora, and D. Ghosh [5].

• “Continuous Gesture Trajectory Recognition System Based on Computer Vision," X. Wenkai and E. J. Lee [6].

• “Vision-Based Continuous Sign Language Recognition using Product HMM," S. H. Yu, C. L. Huang, S. C. Hsu, H. W. Lin, and H. W. Wang [7].

• “Gesture Recognition for Alphabets from Hand Motion Trajectory Using Hidden Markov Models," M. Elmezain, A. Hamadi, G. Krell, S. Etriby, and B. Michaelis [8].

• “Evaluation of HMM Training Algorithms for Letter Hand Gesture Recognition," N. Liu, B. C. Lovell, and P. J. Kootsookos [9].


Problems Identified

• Free-hand tracking
• Recognition of similar gestures and movement epenthesis


Proposed Techniques and Approaches

• Select the best segmentation model
• Consider the best lighting condition


Proposed Techniques and Approaches (contd.)

• We have considered three hand-segmentation models:
  ◦ Image-intensity based model
  ◦ Background-subtraction model
  ◦ HSV+YCbCr model based on skin colour

• The three lighting conditions considered are:
  ◦ Normal light
  ◦ Poor light
  ◦ Bright light


Results at Various Lighting Conditions

Figure: Results at various lighting conditions


Comparison of Results among the Models

Model                     | Normal light | Poor light | Bright light
Image-intensity           | Working      | Sensitive  | Working
Background subtraction    | Sensitive    | Sensitive  | Working
Skin-colour HSV+YCbCr     | Robust       | Working    | Robust

• The HSV+YCbCr model has been adopted in this work owing to its best overall performance.


The Steps/Algorithm of the Adopted System Model

• The input RGB frame is converted to the HSV and YCbCr planes,

• Skin pixels are extracted by thresholding both frames,

• Both frames are converted into binary frames,

• Morphological operations follow to remove unwanted noisy pixels,

• A logical AND operation is performed between the planes to remove background noise and to obtain the largest connected region,

• Centroids of the segmented palm region are calculated using moment calculation,

• Gesture trajectories are obtained by connecting the centroids of the segmented region,

• Alphabets are drawn by hand gestures.
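The thresholding-and-AND steps above can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes the standard ITU-R BT.601 RGB-to-YCbCr conversion and illustrative skin thresholds (77 ≤ Cb ≤ 127, 133 ≤ Cr ≤ 173) commonly quoted in the skin-detection literature; the parallel HSV mask and the morphological clean-up are omitted for brevity.

```python
# Simplified sketch of the skin-segmentation steps, using only the YCbCr
# plane. Thresholds are illustrative, not the paper's exact values.

def rgb_to_ycbcr(r, g, b):
    """ITU-R BT.601 full-range RGB -> (Y, Cb, Cr)."""
    y  =       0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def skin_mask_ycbcr(frame):
    """Binary mask: 1 where a pixel's Cb/Cr fall in an assumed skin range."""
    return [[1 if (77 <= cb <= 127 and 133 <= cr <= 173) else 0
             for (_, cb, cr) in (rgb_to_ycbcr(*px) for px in row)]
            for row in frame]

def logical_and(mask_a, mask_b):
    """Pixel-wise AND (combining the YCbCr mask with the HSV mask)."""
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]

# Example: one skin-toned pixel next to one blue background pixel.
frame = [[(200, 140, 110), (20, 40, 200)]]
print(skin_mask_ycbcr(frame))  # -> [[1, 0]]
```

A full implementation would build the HSV mask the same way, AND the two masks with `logical_and`, apply morphological opening/closing, and then feed the result to the centroid step.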

Flowchart of the System Model

Figure: Flowchart of the system model

Feature Extraction Techniques

The features considered in this work are:

• Orientation

• Gesture trajectory length

• Velocity and acceleration


Feature Extraction Techniques (contd.)

• Orientation feature

θ_i = arctan((y_i − y_{i−1}) / (x_i − x_{i−1})),  i = 1, 2, …, N        (3)

• Gesture trajectory length [5]

d_i = √((x′_i − x′_{i−1})² + (y′_i − y′_{i−1})²)        (4)

l_N = Σ_i √((x′_i − x′_{i−1})² + (y′_i − y′_{i−1})²) = Σ_i d_i        (5)
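Eqs. (3)-(5) can be sketched for a centroid trajectory as below. One implementation choice the slides do not spell out: `atan2` is used instead of a plain arctangent so the angle is resolved to the correct quadrant.

```python
# Sketch of the orientation (Eq. 3) and trajectory-length (Eqs. 4-5)
# features over a list of centroid points [(x, y), ...].
import math

def orientations(points):
    """theta_i between consecutive centroids, in radians (atan2 variant)."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def trajectory_length(points):
    """l_N = sum of Euclidean distances d_i between consecutive centroids."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Example: an L-shaped trajectory, 3 units right then 4 units up.
traj = [(0, 0), (3, 0), (3, 4)]
print(trajectory_length(traj))  # -> 7.0
```

In practice the orientation angles are typically quantized into a small number of direction bins before being fed to the classifier, though the bin count used here is not stated on the slides.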


Feature Extraction Techniques (contd.)

• Velocity and acceleration

  ◦ Velocity (v) is defined as the rate of change of distance with time:

v = ds/dt        (6)

  ◦ In the coordinate system, the velocity [5] is determined by

v_i = (v_x, v_y) = ((x_{i+1} − x_i)/(t_{i+1} − t_i), (y_{i+1} − y_i)/(t_{i+1} − t_i))        (7)

  ◦ Acceleration (a) is defined as the rate of change of velocity:

a = dv/dt        (8)

a_i = (v_{i+1} − v_i)/(t_{i+1} − t_i)        (9)
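Eqs. (7) and (9) reduce to finite differences over the timestamped centroid trajectory. In this sketch speed is taken as the magnitude of (v_x, v_y) so that acceleration becomes a scalar, an interpretation the slides leave implicit:

```python
# Finite-difference sketch of Eqs. 7 and 9 for samples [(t, x, y), ...].
import math

def velocities(samples):
    """v_i = ((x_{i+1}-x_i)/(t_{i+1}-t_i), (y_{i+1}-y_i)/(t_{i+1}-t_i))."""
    return [((x1 - x0) / (t1 - t0), (y1 - y0) / (t1 - t0))
            for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:])]

def accelerations(samples):
    """a_i = (|v_{i+1}| - |v_i|) / (t_{i+1} - t_i), using speed magnitudes."""
    speeds = [math.hypot(vx, vy) for vx, vy in velocities(samples)]
    times = [t for t, _, _ in samples]
    return [(s1 - s0) / (t1 - t0)
            for s0, s1, t0, t1 in zip(speeds, speeds[1:], times, times[1:])]

# Example: uniform motion along x gives zero acceleration throughout.
uniform = [(0, 0, 0), (1, 2, 0), (2, 4, 0), (3, 6, 0)]
print(velocities(uniform)[0])  # -> (2.0, 0.0)
print(accelerations(uniform))  # -> [0.0, 0.0]
```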


Feature Extraction System Model

Figure: Feature extraction block diagram

Feature Extraction System Model (contd.)

Figure: Feature extraction techniques

Experimental Results


Network Specifications of MLP for Isolated Gesture Alphabets

ANN type                 | MLP
Training algorithm       | traingda
Maximum number of epochs | 200
Input layer size         | 128
MSE goal                 | 0.0001
Training time            | 18.05 seconds
Testing time             | 15.86 seconds


Isolated Gesture Alphabets

Figure: A few isolated gestures considered in the work

Confusion Matrix for Isolated Gesture Alphabets

Gesture | S  | T  | CS | RS | WS | RR (%) = CS/T × 100
A       | 30 | 30 | 27 | 1  | 2  | 90.0
B       | 30 | 30 | 26 | 2  | 2  | 86.7
C       | 30 | 30 | 29 | 1  | 0  | 96.7
D       | 30 | 30 | 25 | 2  | 3  | 83.3
E       | 30 | 30 | 29 | 1  | 0  | 96.7
F       | 30 | 30 | 29 | 0  | 1  | 96.7
O       | 30 | 30 | 28 | 1  | 1  | 93.3
Q       | 30 | 30 | 29 | 1  | 0  | 96.7

• S is the total number of samples trained; T, the number of samples tested; CS, the number of correctly spotted gestures; RS, the number of rejected gesture samples; WS, the number of wrongly spotted gestures; and RR, the recognition rate in %.

• Overall recognition rate = (240 − 18)/240 = 92.5%
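As a sanity check, the per-class and overall rates can be recomputed directly from the T and CS counts in the table:

```python
# Re-check of the recognition-rate arithmetic: RR = CS / T * 100 per
# gesture, and the overall rate pools all classes.
counts = {  # gesture: (tested T, correctly spotted CS)
    "A": (30, 27), "B": (30, 26), "C": (30, 29), "D": (30, 25),
    "E": (30, 29), "F": (30, 29), "O": (30, 28), "Q": (30, 29),
}

per_class = {g: round(100.0 * cs / t, 1) for g, (t, cs) in counts.items()}
total_t = sum(t for t, _ in counts.values())
total_cs = sum(cs for _, cs in counts.values())
overall = round(100.0 * total_cs / total_t, 1)

print(per_class["A"], per_class["D"])  # -> 90.0 83.3
print(total_t - total_cs, overall)     # -> 18 92.5
```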


Recognition Rate Results for Isolated Gesture Alphabets

No. of trained samples per gesture class    | 30
No. of tested samples per gesture class     | 30
No. of validation samples per gesture class | 30
No. of gestures taken                       | 8
Total no. of gestures tested                | 30 × 8 = 240
Total misclassified gestures                | 18
Recognition rate                            | 222/240 = 92.5%
Best result                                 | C, E, F, Q
Poorest result                              | D


Hand Gesture Recognition for Continuous Gesture Alphabets

• Regular approaches for detecting isolated gesture symbols do not hold for continuous gestures, since the problem of separating two adjacent gestures arises.

• The separation requires another powerful feature, such as acceleration, to detect individual symbol boundaries.

• It has been observed that the velocity (v) remains almost constant within a particular gesture symbol, so the acceleration (a) becomes zero: if v = constant, then a = dv/dt = 0.

• By selecting a suitable threshold for the acceleration, the symbol boundary can be detected.

• In this work, the acceleration and velocity features have been used to spot the start and end points of each individual gesture within the continuous gesture stream.
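The boundary-spotting idea above can be sketched as follows: within a symbol the speed is nearly constant (acceleration ≈ 0), so frames where the speed change exceeds a threshold are flagged as candidate symbol boundaries. The threshold value here is illustrative, not the paper's.

```python
# Hedged sketch of acceleration-threshold boundary spotting, assuming
# unit time steps between frames and an illustrative threshold.

def boundary_frames(speeds, threshold=1.0):
    """Indices i where |speed[i+1] - speed[i]| exceeds the threshold,
    i.e. candidate boundaries between two gesture symbols."""
    return [i for i in range(len(speeds) - 1)
            if abs(speeds[i + 1] - speeds[i]) > threshold]

# Example: steady speed ~2, a jump between frames 3 and 4, then steady ~5.
speeds = [2.0, 2.1, 2.0, 2.05, 5.0, 5.1, 5.0]
print(boundary_frames(speeds))  # -> [3]
```

The segments between successive boundary frames would then be classified individually, which is how this scheme sidesteps movement epenthesis.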

Hand Gesture Recognition for Continuous Gesture Alphabets (contd.)

• A Multi-Layer Perceptron (MLP) and a Focused Time-Delay Neural Network (FTDNN) are trained with this feature information to recognize the individual gestures within the continuous gestures.

Figure: Continuous gesture recognition techniques


Continuous Gesture Alphabet

Figure: Continuous gesture alphabet BA


Network Specifications of MLP for Continuous Gesture Alphabets

ANN type                 | MLP
Training algorithm       | traingda
Maximum number of epochs | 200
Input layer size         | 256
MSE goal                 | 0.0001
Training time            | 25.45 seconds
Testing time             | 21.56 seconds


Confusion Matrix for Continuous Gesture Alphabets with MLP

Gesture | S  | T  | CS | RS | WS | RR (%) = CS/T × 100
BA      | 30 | 30 | 26 | 2  | 2  | 86.7
AD      | 30 | 30 | 25 | 2  | 3  | 83.3
CA      | 30 | 30 | 28 | 2  | 0  | 93.3
CB      | 30 | 30 | 27 | 1  | 2  | 90.0
CF      | 30 | 30 | 29 | 1  | 0  | 96.7
DB      | 30 | 30 | 26 | 3  | 1  | 86.7
AA      | 30 | 30 | 26 | 2  | 2  | 86.7

• Overall recognition rate = (210 − 23)/210 = 89.05%


Recognition Rate Results for Continuous Gesture Alphabets with MLP

No. of trained samples per gesture class    | 30
No. of tested samples per gesture class     | 30
No. of validation samples per gesture class | 30
No. of gestures taken                       | 7
Total no. of gestures tested                | 30 × 7 = 210
Total misclassified gestures                | 23
Recognition rate                            | 187/210 = 89.05%
Best result                                 | CF
Poorest result                              | AD


Network Specifications of FTDNN for Continuous Gesture Alphabets

ANN type                 | FTDNN
Training algorithm       | traingda
Maximum number of epochs | 200
Input layer size         | 256
MSE goal                 | 0.0001
Training time            | 26.07 seconds
Testing time             | 21.34 seconds


Confusion Matrix for Continuous Gesture Alphabets with FTDNN

Gesture | S  | T  | CS | RS | WS | RR (%) = CS/T × 100
BA      | 30 | 30 | 25 | 1  | 4  | 83.3
AD      | 30 | 30 | 25 | 3  | 2  | 83.3
CA      | 30 | 30 | 27 | 1  | 2  | 90.0
CB      | 30 | 30 | 28 | 2  | 0  | 93.3
CF      | 30 | 30 | 29 | 0  | 1  | 96.7
DB      | 30 | 30 | 24 | 4  | 2  | 80.0
AA      | 30 | 30 | 25 | 1  | 4  | 83.3

• Overall recognition rate = (210 − 27)/210 = 87.14%


Recognition Rate Results for Continuous Gesture Alphabets with FTDNN

No. of trained samples per gesture class    | 30
No. of tested samples per gesture class     | 30
No. of validation samples per gesture class | 30
No. of gestures taken                       | 7
Total no. of gestures tested                | 30 × 7 = 210
Total misclassified gestures                | 27
Recognition rate                            | 183/210 = 87.14%
Best result                                 | CF
Poorest result                              | DB


Discussions

• The combined orientation and trajectory-length features for isolated gestures efficiently deal with the problem of recognizing similar gestures.

• The velocity and acceleration features enable recognition of continuous gestures, and in the process the problem of movement epenthesis is removed.


Comparison with Previous Work

• In [6], the recognition rate obtained for isolated Arabic numerals (0-9) is 93.72% using a Hidden Markov Model (HMM), and increases to 95.76% when the HMM-FNN (Fuzzy Neural Network) hybrid framework is employed.

• HMM and Conditional Random Field (CRF) models are used more frequently than ANNs for pattern recognition, and [6] considers only isolated gestures; the recognition rate obtained in this work can therefore be regarded as satisfactory.


Limitations

• The method is illumination sensitive.

• A skin-coloured background deteriorates the result.

• The gesturer must always wear full-sleeve clothing.

• The presence of multiple gesturers aggravates the problem.

• The method is not a real-time gesture recognition system.


Conclusion and Future Direction

• A free-hand gesture recognition system for both isolated and continuous gestures has been developed.

• The problems of recognizing similar gestures and movement epenthesis have been eradicated.

• Future work will aim to develop and implement the system for the bare hand, without full-sleeve clothing.


References

[1] S. Mitra and T. Acharya, “Gesture Recognition: A Survey", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 37, no. 3, pp. 311-324, 2007.

[2] M. E. Al-Ahdal and N. M. Tahir, “Review in Sign Language Recognition Systems", IEEE Symposium on Computers and Informatics, pp. 52-57, 2002.

[3] Z. Yang, Y. Li, W. Chen, and Y. Zheng, “Dynamic Hand Gesture Recognition Using Hidden Markov Model", 7th International Conference on Computer Science Education, pp. 360-365, 2012.

[4] J. Appenrodt, A. Hamadi, and B. Michaelis, “Data Gathering for Gesture Recognition Systems Based on Single Color, Stereo Color and Thermal Cameras", International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 3, pp. 37-50, 2010.

[5] M. K. Bhuyan, P. K. Bora, and D. Ghosh, “Trajectory Guided Recognition of Hand Gestures having only Global Motions", World Academy of Science, Engineering and Technology, vol. 21, pp. 753-764, 2008.

[6] X. Wenkai and E. J. Lee, “Continuous Gesture Trajectory Recognition System Based on Computer Vision", International Journal of Applied Mathematics and Information Science, pp. 339-346, 2012.

[7] S. H. Yu, C. L. Huang, S. C. Hsu, H. W. Lin, and H. W. Wang, “Vision-Based Continuous Sign Language Recognition using Product HMM", First Asian Conference on Pattern Recognition (ACPR), pp. 510-514, 2011.

[8] M. Elmezain, A. Hamadi, G. Krell, S. Etriby, and B. Michaelis, “Gesture Recognition for Alphabets from Hand Motion Trajectory Using Hidden Markov Models", 7th IEEE International Symposium on Signal Processing and Information Technology, pp. 1209-1214, 2007.

[9] N. Liu, B. C. Lovell, and P. J. Kootsookos, “Evaluation of HMM Training Algorithms for Letter Hand Gesture Recognition", Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), pp. 648-651, 2003.

[10] G. Bradski and A. Kaehler, Learning OpenCV, 1st Ed., O'Reilly Media Inc., 2008.


Thank You
