NORMA eResearch @NCI Library

Speech Emotion Recognition using Deep Learning

Chimthankar, Priyanka Prashant (2021) Speech Emotion Recognition using Deep Learning. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 2MB
PDF (Configuration manual), 1MB

Abstract

Speech Emotion Recognition (SER) has a broad range of applications, and the area has attracted a significant amount of research in recent years. However, SER in the entertainment sector remains understudied. This thesis uses Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) architectures to classify the emotions in audio recordings of actors expressing various emotions. It presents an innovative method that combines a 2D CNN+LSTM model with Mel-frequency cepstral coefficient (MFCC) features extracted from the audio data. Multiple experiments are used to assess the reliability of such deep learning systems. The model is based on four widely used SER datasets (SAVEE, RAVDESS, TESS, and CREMA-D) and achieves a validation accuracy of 67.58%. The model was additionally evaluated on a previously unseen dataset of German-language audio samples, where it achieved a testing accuracy of 71.28%.
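
As an illustration of the MFCC features mentioned in the abstract, the following is a minimal pure-Python sketch of the standard MFCC pipeline (framing, Hamming window, power spectrum, mel filterbank, log, DCT-II). It is not the thesis's implementation: the function names, frame length, hop size, filter count, and coefficient count are all assumptions chosen for readability, and a practical system would more likely rely on an audio library. The output is a frames-by-coefficients matrix, which is one plausible way such features can be arranged as a 2D input for a CNN.

```python
import math

def hz_to_mel(f):
    # Common O'Shaughnessy formula for the mel scale.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hamming-windowed frame, via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return a list of frames, each a list of n_coeffs MFCCs (assumed parameters)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Log filterbank energies; the small floor avoids log(0).
        log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                 for row in bank]
        # Unnormalised DCT-II decorrelates the log energies.
        coeffs = [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                      for m, e in enumerate(log_e))
                  for c in range(n_coeffs)]
        frames.append(coeffs)
    return frames

# Synthetic input: a 440 Hz tone sampled at 8 kHz (a stand-in for real speech).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]
features = mfcc(signal, sample_rate)
print(len(features), len(features[0]))
```

With these assumed settings, the 1024-sample input yields 7 frames of 13 coefficients each. The naive DFT is used only to keep the sketch dependency-free; an FFT would replace it for real recordings.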

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science

Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Clara Chan
Date Deposited: 15 Nov 2021 16:59
Last Modified: 15 Nov 2021 16:59
URI: http://norma.ncirl.ie/id/eprint/5142
