NORMA eResearch @NCI Library

Predicting Evoked Expression from Videos Using Convolutional LSTM

Anandhapadmanaban, Kishore Kumar (2023) Predicting Evoked Expression from Videos Using Convolutional LSTM. Masters thesis, Dublin, National College of Ireland.

Files:
PDF (Master of Science), 4MB
PDF (Configuration manual), 5MB

Abstract

Evoked expressions in videos, the spontaneous emotional reactions of viewers, are fundamental to understanding human emotions and intentions. Predicting these expressions is challenging because of the intricate interplay of spatial, temporal, and auditory information in videos. Traditional methods often rely on unimodal strategies, using either visual or auditory cues, and typically employ conventional architectures such as a Convolutional Neural Network (CNN) or the Visual Geometry Group network (VGG-16), which often fail to capture the multifaceted nature of expressions. This research uses the Evoked Expressions from Videos (EEV) dataset and proceeds in several stages: frames are extracted from the videos and spectrograms are generated from the matching audio, yielding a comprehensive visual representation of the data. The study introduces a novel multimodal approach that combines audio and visual cues and implements a Convolutional Long Short-Term Memory (Conv-LSTM) model to exploit both spatial and temporal dimensions. Our approach overcomes existing limitations by integrating the spatial feature extraction capabilities of CNNs with the sequential modelling strengths of Long Short-Term Memory (LSTM) networks. In addition, our research compares the performance of the proposed model with a custom CNN model and the VGG-16 algorithm. Rigorous evaluations demonstrate that our model outperforms the other methods in terms of validation loss, mean squared error, and mean absolute error, offering a robust and effective solution for predicting evoked expressions from videos. The Conv-LSTM achieves a mean absolute error of 0.197, a mean squared error of 0.064, and a validation loss of 0.64, outperforming the other two models.
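The abstract does not give implementation details, but the described fusion of video frames and audio spectrograms through a Conv-LSTM can be sketched roughly as below. This is a minimal illustrative sketch in Keras: the input shapes, layer widths, sigmoid output, and the 15-dimensional score vector are assumptions for illustration, not the configuration used in the thesis.

```python
# Minimal sketch of a multimodal Conv-LSTM regressor (illustrative only).
# Assumptions: 16 RGB frames of 64x64 per clip, a 128x128 single-channel
# spectrogram, and 15 evoked-expression scores in [0, 1] per clip.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_conv_lstm_model(n_frames=16, frame_size=64,
                          spec_size=128, n_outputs=15):
    # Visual branch: ConvLSTM2D captures spatial structure within each frame
    # and temporal dynamics across the frame sequence.
    frames_in = layers.Input(shape=(n_frames, frame_size, frame_size, 3))
    x = layers.ConvLSTM2D(32, (3, 3), padding="same",
                          return_sequences=False)(frames_in)
    x = layers.BatchNormalization()(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Audio branch: a small CNN over the spectrogram image.
    spec_in = layers.Input(shape=(spec_size, spec_size, 1))
    y = layers.Conv2D(32, (3, 3), activation="relu")(spec_in)
    y = layers.MaxPooling2D()(y)
    y = layers.Conv2D(64, (3, 3), activation="relu")(y)
    y = layers.GlobalAveragePooling2D()(y)

    # Fuse the two modalities and regress the expression scores.
    z = layers.Concatenate()([x, y])
    z = layers.Dense(128, activation="relu")(z)
    out = layers.Dense(n_outputs, activation="sigmoid")(z)

    model = Model([frames_in, spec_in], out)
    # MSE loss with MAE as a metric, matching the evaluation measures
    # reported in the abstract.
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

model = build_conv_lstm_model()
model.summary()
```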

Item Type: Thesis (Masters)
Supervisors: Kumar Meghwar, Teerath (email: UNSPECIFIED)
Uncontrolled Keywords: Evoked Expression; EEV dataset; CNN; Conv-LSTM; VGG-16; Mean Square Error; Mean Absolute Error; Validation Error
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence > Computer vision
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence > Computer vision
B Philosophy. Psychology. Religion > Psychology > Emotions
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Tamara Malone
Date Deposited: 08 Nov 2024 11:49
Last Modified: 08 Nov 2024 11:49
URI: https://norma.ncirl.ie/id/eprint/7168
