Dhullipalla, Tulasiram (2024) Detection of suicidal content in the social media posts using advanced predictive classifiers. Masters thesis, Dublin, National College of Ireland.
Abstract
Suicide is a serious global health concern. Every year, approximately 800,000 people take their own lives, which means one person dies by suicide every 40 seconds. Early detection of individuals with suicidal thoughts is therefore important to save lives, and identifying a person with suicidal ideation is the crucial first step in prevention. Traditionally, data on individuals with suicidal thoughts were collected by doctors through direct interaction. However, it is now difficult to gather data directly from individuals. In the modern era, social media usage has increased drastically: Statista reports that 98 percent of individuals aged 15 to 24 in Europe use social media (internet). Social networks therefore offer a significant advantage for the early detection and identification of suicidal thoughts in a particular person. The objective of this research is to design and implement advanced predictive models that identify suicidal thoughts in social media posts. Analyzing posts using machine learning (ML) and artificial neural networks (ANN) helps to understand the emotional state of the individual at the time of posting. This analyzed information helps doctors, psychologists, and parents to treat affected people properly. To achieve the project aim, Long Short-Term Memory (LSTM), Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM), Support Vector Machine (SVM), and Logistic Regression models were implemented to classify whether a post expresses suicidal thoughts. Data was collected from Kaggle (Reddit, Twitter). After data collection, the data was preprocessed: cleaning, tokenization, feature representation such as TF-IDF and word embeddings, and analysis. For the Reddit dataset, LSTM, CNN+LSTM, Logistic Regression, and SVM achieved accuracies of 93.23%, 92.74%, 93.17%, and 75.28% respectively. For the Twitter dataset, LSTM, CNN+LSTM, Logistic Regression, and SVM achieved accuracies of 98.57%, 98.57%, 97.92%, and 98.65% respectively. The results indicate that CNN-LSTM performed well compared with traditional machine learning methods. This study shows the importance of machine learning and artificial neural networks for improving health monitoring systems by analyzing social media posts in real time and enabling immediate action to prevent suicide.
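The abstract does not describe how the models were implemented, so the following is only a minimal sketch of one of the named pipelines: TF-IDF features feeding a logistic regression classifier. It uses only the Python standard library so that it stays self-contained; a real project would more likely rely on an established library. The toy posts, the labels, the smoothed IDF variant, the learning rate, and all function names are assumptions made for illustration and are not taken from the thesis or its Kaggle datasets.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real cleaning."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

def tfidf_vector(doc, idf, vocab):
    """L2-normalised TF-IDF vector over a fixed vocabulary order."""
    tf = Counter(doc)
    vec = [tf[t] * idf[t] for t in vocab]  # terms outside vocab are ignored
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss (hypothetical settings)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(text, w, b, idf, vocab):
    """Probability that a post belongs to the positive (at-risk) class."""
    x = tfidf_vector(tokenize(text), idf, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Invented toy data: 1 = post flagged as at-risk, 0 = not flagged.
posts = [
    "i feel hopeless and want to give up",
    "nothing matters anymore i feel empty",
    "had a great day at the park",
    "excited about the concert this weekend",
]
labels = [1, 1, 0, 0]

docs = [tokenize(p) for p in posts]
idf = fit_idf(docs)
vocab = sorted(idf)
X = [tfidf_vector(d, idf, vocab) for d in docs]
w, b = train_logreg(X, labels)

for text in ["i feel empty", "great day at the park"]:
    print(text, "->", round(predict_proba(text, w, b, idf, vocab), 3))
```

A linear model over TF-IDF features like this one is the kind of traditional baseline the abstract compares against the LSTM and CNN-LSTM models, which operate on word embeddings instead.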