NORMA eResearch @NCI Library

Deep Learning Approaches to Real-Time Sign Language Recognition and Multilingual Translation

Guttula, Harika (2024) Deep Learning Approaches to Real-Time Sign Language Recognition and Multilingual Translation. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science) — Download (1MB)
PDF (Configuration Manual) — Download (860kB)

Abstract

This study proposes a hybrid deep learning model combining Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for American Sign Language (ASL) gesture recognition and real-time multilingual speech translation. The CNN component extracts spatial features from ASL gesture images, while the RNN component captures temporal features using Long Short-Term Memory (LSTM) networks. The model recognises 36 classes of ASL gestures, covering the digits 0-9 and the letters A-Z, and is combined with a multilingual speech output module based on Google Text-to-Speech (gTTS) that renders the recognised signs as spoken words in Spanish, French, and Arabic.
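The 36-class label set and the gTTS-based speech output described in the abstract can be sketched as follows. This is an illustrative assumption of how such a pipeline is typically wired, not the thesis code: the `CLASSES` ordering, the `TARGET_LANGS` mapping, and the `speak()` helper are hypothetical, and gTTS needs network access, so the library is only imported when speech is actually requested.

```python
# Illustrative sketch (not the thesis implementation): the 36 ASL gesture
# classes (digits 0-9, then letters A-Z) and a gTTS speech helper for the
# three target languages named in the abstract.

# Class labels in a hypothetical digits-first ordering.
CLASSES = [str(d) for d in range(10)] + [chr(c) for c in range(ord("A"), ord("Z") + 1)]

# gTTS language codes for the spoken-output languages.
TARGET_LANGS = {"spanish": "es", "french": "fr", "arabic": "ar"}

def speak(text: str, language: str, out_path: str = "sign.mp3") -> str:
    """Render recognised text as speech with gTTS (requires network access)."""
    from gtts import gTTS  # deferred import: optional, network-dependent dependency
    gTTS(text=text, lang=TARGET_LANGS[language]).save(out_path)
    return out_path
```

In a full pipeline, the CNN+RNN model's predicted class index would be mapped through `CLASSES` before being passed to `speak()`; the translation step from recognised letters to target-language words is not shown here.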

The model was trained and tested on a dataset of 2,515 ASL images, with performance measured over 10 training iterations. Training accuracy rose from 27.10% to 97.66%, validation accuracy reached 99.01%, and test accuracy was 97.61%, indicating strong generalisation. Classifier performance was 97% precision, 98% recall, and a 97% F1-score. However, metrics for some classes, such as 'o' and 'z', were slightly lower, suggesting that class imbalance and feature overlap are the main issues to address. The ability to output speech in multiple languages is a major advantage, broadening the model's practical usability across users and settings. This research focuses on applying the proposed hybrid CNN+RNN model to ASL gesture recognition and on its potential use in translating real-time sign language into spoken language and vice versa. The findings also contribute to the development of assistive technologies, offering a solid foundation for advancing ASL recognition systems. Future work will address the remaining performance discrepancies and explore techniques such as data augmentation and specialised loss functions to further optimise the model.
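The per-class precision, recall, and F1 figures reported above are conventionally derived from a confusion matrix. The sketch below is a generic NumPy implementation of that standard computation, not the thesis code; the toy 3-class matrix is an invented example mimicking the kind of between-class overlap noted for 'o' and 'z'.

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall, and F1 per class from a confusion matrix
    (rows = true labels, columns = predicted labels)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                                   # correct predictions per class
    precision = tp / np.maximum(cm.sum(axis=0), 1e-12)  # TP / predicted positives
    recall = tp / np.maximum(cm.sum(axis=1), 1e-12)     # TP / actual positives
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return precision, recall, f1

# Toy example: classes 1 and 2 are partially confused with each other,
# which depresses their precision/recall relative to class 0.
cm = [[10, 0, 0],
      [0, 9, 1],
      [0, 2, 8]]
p, r, f = per_class_metrics(cm)
```

Averaging these per-class values (macro or weighted) yields the single precision/recall/F1 summary figures quoted in the abstract.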

Item Type: Thesis (Masters)
Supervisors: Hamil, David (email: UNSPECIFIED)
Uncontrolled Keywords: Convolutional Neural Networks (CNNs); Recurrent Neural Networks (RNNs); American Sign Language (ASL); LSTM; ASL gesture recognition
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Semiotics > Language. Linguistic theory > Gesture. Sign language
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 02 Sep 2025 12:37
Last Modified: 02 Sep 2025 12:37
URI: https://norma.ncirl.ie/id/eprint/8708
