Guttula, Harika (2024) Deep Learning Approaches to Real-Time Sign Language Recognition and Multilingual Translation. Masters thesis, Dublin, National College of Ireland.
PDF (Master of Science): Download (1MB)
PDF (Configuration Manual): Download (860kB)
Abstract
This study proposes a hybrid Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) deep learning model for American Sign Language (ASL) gesture recognition and real-time multilingual speech translation. The CNN component extracts spatial features from the ASL gesture images, while the RNN component, implemented with Long Short-Term Memory (LSTM) networks, captures temporal features. The model is built to recognise 36 classes of ASL gestures, covering the digits 0-9 and the letters A-Z, and is combined with a multilingual speech output module based on Google Text-to-Speech (gTTS) that renders the recognised signs as spoken words in Spanish, French and Arabic.
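The record page does not give the model's exact configuration. The following minimal Keras sketch illustrates the kind of hybrid CNN+LSTM classifier described above, with 36 output classes and an illustrative gTTS call for the speech output; the input size, layer widths and the spoken word used in the gTTS example are assumptions, not the thesis's actual settings.

```python
# Minimal sketch of a hybrid CNN + LSTM classifier for the 36 ASL classes,
# assuming 64x64 RGB inputs; layer sizes and input shape are illustrative
# choices, not the configuration used in the thesis.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 36  # digits 0-9 and letters A-Z


def build_cnn_rnn(input_shape=(64, 64, 3)):
    inputs = tf.keras.Input(shape=input_shape)
    # CNN stage: spatial feature extraction from the gesture image.
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    # RNN stage: treat each row of the feature map as one timestep so the
    # LSTM can model sequential structure in the extracted features.
    x = layers.Reshape((x.shape[1], x.shape[2] * x.shape[3]))(x)
    x = layers.LSTM(128)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return models.Model(inputs, outputs)


model = build_cnn_rnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Multilingual speech output with gTTS: synthesise an (already translated)
# word for a predicted sign. The word here is a placeholder; the thesis's
# actual label-to-translation step is not described on this page.
from gtts import gTTS

gTTS(text="cinco", lang="es").save("prediction_es.mp3")  # e.g. the digit '5' in Spanish
```

Feeding the CNN feature-map rows to an LSTM is one common way to pair the two architectures for single-image input; the thesis may instead apply the LSTM over frame sequences.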
The model was trained and tested on a dataset of 2,515 ASL images, with performance evaluated over 10 training iterations. Training accuracy rose from 27.10% to 97.66%, validation accuracy reached 99.01%, and test accuracy was 97.61%, indicating strong generalisation. The per-class performance was a precision of 97%, a recall of 98% and an F1-score of 97%. However, the metrics for some classes, such as 'o' and 'z', are slightly lower, pointing to class imbalance and feature overlap as the main issues still to be addressed. The ability to output speech in multiple languages is a major advantage and broadens the model's practical usability across users and settings. This research focuses on applying the proposed hybrid CNN+RNN model to ASL gesture recognition and on its potential use for translating real-time sign language into spoken language and vice versa. The findings also contribute to the development of assistive technologies, offering a solid foundation for the advancement of ASL recognition systems. Future work will address the minor performance discrepancies and explore techniques such as data augmentation and specialised loss functions to further optimise the model.
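For reference, per-class figures like the precision, recall and F1 scores quoted above can be produced with scikit-learn as sketched below; `y_true` and `y_pred` are small placeholders, since the thesis's test labels and predictions are not available on this record page.

```python
# Sketch of per-class evaluation for the 36 ASL classes using scikit-learn.
# y_true and y_pred are tiny placeholders, not the thesis's test data.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

class_names = [str(d) for d in range(10)] + [chr(c) for c in range(ord("a"), ord("z") + 1)]

y_true = np.array([0, 14, 24, 25, 35])   # placeholder ground-truth class indices
y_pred = np.array([0, 14, 24, 25, 24])   # placeholder predicted class indices

# Per-class precision, recall and F1; weaker rows (e.g. 'o', 'z') would flag
# the class-imbalance and feature-overlap issues noted in the abstract.
print(classification_report(y_true, y_pred,
                            labels=list(range(36)),
                            target_names=class_names,
                            zero_division=0))

# The confusion matrix shows which gestures are mistaken for one another.
print(confusion_matrix(y_true, y_pred, labels=list(range(36))))
```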
Item Type: Thesis (Masters)
Supervisors: Hamil, David (email unspecified)
Uncontrolled Keywords: Convolutional Neural Networks (CNNs); Recurrent Neural Networks (RNNs); American Sign Language (ASL); LSTM; ASL gesture recognition
Subjects:
  Q Science > QA Mathematics > Electronic computers. Computer science
  T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
  P Language and Literature > P Philology. Linguistics > Semiotics > Language. Linguistic theory > Gesture. Sign language
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Ciara O'Brien
Date Deposited: 02 Sep 2025 12:37
Last Modified: 02 Sep 2025 12:37
URI: https://norma.ncirl.ie/id/eprint/8708