Sign and human action detection using deep learning.

Dhulipala, S., Adedoyin, F. and Bruno, A., 2022. Sign and human action detection using deep learning. Journal of Imaging, 8 (7), 192.

Full text available as:

PDF (OPEN ACCESS ARTICLE)
jimaging-08-00192.pdf - Published Version
Available under License Creative Commons Attribution.
5MB

PDF
jimaging-1651181_proofread 2.pdf - Accepted Version
Restricted to Repository staff only
Available under License Creative Commons Attribution.
1MB

Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk.

Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.

DOI: 10.3390/jimaging8070192

Abstract

Human beings usually rely on communication to express their feelings and ideas and to resolve disputes among themselves. A major component of effective communication is language. Language can take different forms, including written symbols, gestures, and vocalizations. It is usually essential for all communicating parties to be fully conversant in a common language. However, this has not been the case between speech-impaired people who use sign language and the regular people in society who use spoken languages. Different studies have pointed out a significant gap between these two groups, limiting the ease of communication. Therefore, this study aims to develop an efficient deep learning model that can be used to predict British sign language, in an attempt to narrow the communication gap between speech-impaired people and the regular people in the community. Two models, a CNN and an LSTM, were developed in the research, and their performance was evaluated using a multi-class confusion matrix. The CNN model achieved the highest performance, attaining training and testing accuracies of 98.8% and 97.4%, respectively. It also achieved weighted-average precision and recall of 97% and 96%, respectively. The LSTM model, on the other hand, performed poorly, with maximum training and testing performance of 49.4% and 48.7%, respectively. The research concluded that the CNN model was the better of the two for recognizing British sign language.
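
As an illustration of the evaluation metrics named in the abstract, the following minimal sketch shows how weighted-average precision and recall can be derived from a multi-class confusion matrix. It is not the authors' code: the function names, the three sign labels, and the toy predictions are all hypothetical and are not taken from the paper's data.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where rows are true classes and columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def weighted_precision_recall(matrix):
    """Average per-class precision and recall, weighted by each class's true count."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    precision = 0.0
    recall = 0.0
    for k in range(n):
        true_positives = matrix[k][k]
        support = sum(matrix[k])                         # row sum: samples truly in class k
        predicted = sum(matrix[r][k] for r in range(n))  # column sum: samples predicted as k
        class_precision = true_positives / predicted if predicted else 0.0
        class_recall = true_positives / support if support else 0.0
        precision += class_precision * support / total
        recall += class_recall * support / total
    return precision, recall


# Hypothetical toy data: three sign classes, six samples (illustrative only).
labels = ["A", "B", "C"]
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]

matrix = confusion_matrix(y_true, y_pred, labels)
precision, recall = weighted_precision_recall(matrix)
print(matrix)
print(round(precision, 4), round(recall, 4))
```

With support-based weighting, the weighted recall is the same as overall accuracy, because each class's recall is scaled by its share of the samples.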

Item Type:	Article
ISSN:	2313-433X
Uncontrolled Keywords:	CNN; LSTM; confusion matrix; British sign language; precision; recall
Group:	Faculty of Science & Technology
ID Code:	36858
Deposited By:	Symplectic RT2
Deposited On:	09 May 2022 10:21
Last Modified:	12 Jul 2022 11:09
