Quora Insincere Question Classification with word Embedding Algorithms

Suryawanshi, Namrata Ashok

Suryawanshi, Namrata Ashok (2023) Quora Insincere Question Classification with word Embedding Algorithms. Masters thesis, Dublin, National College of Ireland.


Abstract

This report presents a study on the application of deep learning (DL) models for predicting the sincerity of questions on the Quora platform. The primary objective is to distinguish sincere from insincere questions in order to support content moderation and improve user experience. The study explores several model architectures, including CNN+LSTM+GRU and BILSTM, with and without attention mechanisms and pre-trained GloVe embeddings. The models are implemented and evaluated using accuracy, precision, recall (sensitivity), specificity and F1-score. Among the models examined, the BILSTM model with an added attention layer and pre-trained GloVe embeddings performs best, achieving a validation accuracy of 89.10%. This result points to the value of attention mechanisms and pre-trained embeddings for model performance. The findings indicate that DL approaches are effective for classifying question sincerity on Quora, with implications for content moderation and user engagement on social media platforms. The study also identifies areas for further research, such as exploring different embeddings, applying ensembling techniques, and addressing class imbalance to improve model performance. This research contributes insights into the use of DL techniques for content analysis and classification on social media platforms and provides a basis for future work in this domain.
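
The evaluation metrics named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch, not the thesis's own evaluation code: the function name `binary_metrics`, the toy label lists, and the convention that label 1 means "insincere" are assumptions made for illustration. It also shows why accuracy alone can look favourable when insincere questions are the minority class.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float        # recall and sensitivity are the same quantity
    specificity: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute metrics from 0/1 labels, treating 1 as the positive class.

    Assumption for this sketch: 1 = insincere, 0 = sincere.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return BinaryMetrics(
        accuracy=ratio(tp + tn, len(pairs)),
        precision=precision,
        recall=recall,
        specificity=ratio(tn, tn + fp),
        f1=ratio(2 * precision * recall, precision + recall),
    )


# Hypothetical toy data: 8 sincere (0) and 2 insincere (1) questions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

On this toy data the accuracy is 0.8 while precision, recall and F1 for the insincere class are each 0.5, which illustrates why the class imbalance mentioned in the abstract matters when comparing models.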

Item Type:	Thesis (Masters)
Supervisors:	Menghwar, Teerath Kumar (email: UNSPECIFIED)
Uncontrolled Keywords:	CNN+LSTM+GRU; BILSTM; Attention Mechanism; Pre-trained Word Embeddings
Subjects:	Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites; T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites
Divisions:	School of Computing > Master of Science in Data Analytics
Depositing User:	Tamara Malone
Date Deposited:	08 Jan 2025 16:48
Last Modified:	08 Jan 2025 16:48
URI:	https://norma.ncirl.ie/id/eprint/7282
