NORMA eResearch @NCI Library

Analysis of suicide ideation documents posted on Twitter using an NLP classifier

Swamy, Rachana Devnur (2022) Analysis of suicide ideation documents posted on Twitter using an NLP classifier. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 879kB
PDF (Configuration manual), 1MB


One of the leading causes of mortality, suicide currently accounts for 828,000 deaths worldwide every year, up from 712,000 in 1990, making it the ninth leading cause of death worldwide. A growing body of research suggests that social media and the Internet may influence suicide-related behaviour. To assess whether a text contains suicidal ideas, researchers created a very simple suicidal ideation classifier using Natural Language Processing (NLP), a subfield of Machine Learning (ML). Social networks, which allow users to contact friends and family, are crucial tools for learning people's viewpoints on many subjects. In recent years, the identification of suicidal thoughts through online social network analysis has become a far more prominent research problem in both NLP and psychology. With appropriate use of social media data, the complex early signs of suicidal thoughts may be recognised, possibly saving countless lives. The rapid growth of social media websites and technology has enabled anyone to reach millions of people worldwide and share thoughts and emotions with them. Social media websites such as Google+, Instagram, Facebook, Twitter, and LinkedIn have become essential communication channels, where users can produce, distribute, and receive information among a large group of individuals. Although these platforms have benefits, there are user safety concerns related to how they are set up and how disclosures of suicidal ideas are handled. Machine learning models were used to classify posted tweets as positive or negative. The models are evaluated using accuracy and F1 score. The Support Vector Machine (SVM) classifier outperforms the other conventional ML methods, with an accuracy of 92% and an F1 score of 0.92. The Stochastic Gradient Descent (SGD) and Logistic Regression (LR) classifiers performed slightly worse than the SVM classifier, with an accuracy of 91% and an F1 score of 0.91.
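
The abstract reports accuracy and F1 score for each classifier. As a minimal sketch of how these two figures are conventionally computed for a binary task, the Python below derives them from true and predicted labels. The function name `binary_metrics`, the label encoding (1 = positive tweet, 0 = negative tweet), and the eight example labels are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumption: 1 marks the positive class and 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight tweets, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy counts all correct predictions, whereas F1 ignores true negatives, so the two can diverge on imbalanced data; reporting both, as the abstract does, gives a fuller picture than either alone.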

Item Type: Thesis (Masters)
Supervisors: Horn, Christian
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
R Medicine > RA Public aspects of medicine > RA790 Mental Health
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites > Online social networks
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites > Online social networks
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Tamara Malone
Date Deposited: 27 May 2023 10:44
Last Modified: 27 May 2023 10:44
