Packiaraj, Jacob Benny (2024) Integrating audio and text data with deep learning to detect depression. Masters thesis, Dublin, National College of Ireland.
Abstract
Major depressive disorder is a mental health condition caused by stress, trauma, and biological and medical conditions, and it affects millions of people around the world. Timely diagnosis and treatment are crucial in preventing the condition from worsening and in helping individuals regain control over their lives. Traditionally, depression has been identified through clinical interviews that ascertain a patient's mental condition. This study explores a novel approach to automatic depression detection that combines textual and auditory modalities through a meta-learning technique, using the DAIC-WOZ dataset. Features extracted from the transcript data are used in a Bi-GRU/Bi-LSTM model, and a hyperparameter-tuned CNN model is trained to capture audio-related features. The predictions from these models are then integrated by a meta-learner to improve classification accuracy. Natural language processing is used to extract features from the transcript files, and features such as Mel-frequency cepstral coefficients (MFCCs), chroma, and Mel-frequency features are extracted from the audio files to train the proposed models. Initial results show that combining the text features from the Bi-GRU/Bi-LSTM with the audio features from the CNN gives a notable performance improvement over unimodal classification. The bidirectional long short-term memory (Bi-LSTM) and convolutional neural network (CNN) models correctly classify the majority of instances, with 94% accuracy, and the combined model identifies the depressed class with an F1 score of 0.92. This implies that combining auditory and textual information does help in detecting depression, as the method uses both sources of information to make the result more accurate and reliable. Potential future work could include using other features and modalities to investigate the classification of depressive disorders and extending the models to cross-language and cross-cultural contexts.
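The late-fusion step described in the abstract, where a meta-learner combines the predictions of the text and audio models, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say which meta-learner was used, so a small logistic regression stands in for it; the probabilities are made-up numbers, not DAIC-WOZ results; and the function names (`fit_meta_learner`, `predict_fused`, `threshold_predict`, `f1_score`) are hypothetical.

```python
import math

# Made-up held-out predictions, for illustration only (not DAIC-WOZ results).
# Each value is a base model's predicted probability of the depressed class.
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]
TEXT_PROBS = [0.9, 0.8, 0.4, 0.7, 0.2, 0.6, 0.1, 0.3]   # e.g. from a Bi-LSTM
AUDIO_PROBS = [0.7, 0.4, 0.9, 0.9, 0.4, 0.1, 0.3, 0.1]  # e.g. from a CNN


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_meta_learner(text_probs, audio_probs, labels, lr=0.5, epochs=5000):
    """Fit a logistic-regression meta-learner by batch gradient descent.

    Assumption: logistic regression is a stand-in; the source does not
    name the meta-learner it used.
    """
    w_text = w_audio = bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        g_text = g_audio = g_bias = 0.0
        for p_t, p_a, y in zip(text_probs, audio_probs, labels):
            err = sigmoid(w_text * p_t + w_audio * p_a + bias) - y
            g_text += err * p_t
            g_audio += err * p_a
            g_bias += err
        w_text -= lr * g_text / n
        w_audio -= lr * g_audio / n
        bias -= lr * g_bias / n
    return w_text, w_audio, bias


def predict_fused(params, text_probs, audio_probs, threshold=0.5):
    """Combine the two base-model probabilities into one class decision."""
    w_text, w_audio, bias = params
    return [
        1 if sigmoid(w_text * p_t + w_audio * p_a + bias) >= threshold else 0
        for p_t, p_a in zip(text_probs, audio_probs)
    ]


def threshold_predict(probs, threshold=0.5):
    """Unimodal baseline: threshold a single model's probabilities."""
    return [1 if p >= threshold else 0 for p in probs]


def f1_score(y_true, y_pred):
    """F1 score for the positive (depressed) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    params = fit_meta_learner(TEXT_PROBS, AUDIO_PROBS, LABELS)
    fused = predict_fused(params, TEXT_PROBS, AUDIO_PROBS)
    print("text-only F1 :", f1_score(LABELS, threshold_predict(TEXT_PROBS)))
    print("audio-only F1:", f1_score(LABELS, threshold_predict(AUDIO_PROBS)))
    print("fused F1     :", f1_score(LABELS, fused))
```

In this made-up data neither base model is correct on every example, while the two probabilities taken together separate the classes, which is the situation where a meta-learner can help. In a real pipeline the meta-learner would normally be fitted on out-of-fold or validation predictions rather than on the data the base models were trained on, and it would be evaluated on a separate test split.
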
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Milosavljevic, Vladimir (email: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning; R Medicine > RA Public aspects of medicine > RA790 Mental Health |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 25 Aug 2025 08:40 |
Last Modified: | 25 Aug 2025 08:40 |
URI: | https://norma.ncirl.ie/id/eprint/8602 |