Diktekoppa Thimmappa, Darshan (2019) Paragraph Vector based Sarcasm Detection in Text. Masters thesis, Dublin, National College of Ireland.
Abstract
Sarcasm is a persistent challenge in natural language processing and hinders the process of obtaining people's true opinions. It offers an escape for people who do not want to share their true opinion, and it is often used to tease others when people like or dislike something. Sentiment analysis is incomplete without sarcasm detection. Many researchers have worked on this problem using non-machine-learning, machine learning and deep learning techniques, with one of two approaches to feature engineering: manual feature engineering or word embeddings. This research makes a novel adoption of two paragraph vector models, Distributed Bag-of-Words based paragraph vectors (PV-DBOW) and Distributed Memory based paragraph vectors (PV-DM), for sarcasm detection, in place of previously used techniques such as Bag-of-Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec, on the grounds that these older techniques cannot capture word order whereas paragraph vectors can. Three machine learning models, Random Forest (RF), Support Vector Classifier (SVC) and Logistic Regression (LR), will be trained and tested for identifying sarcasm. In addition, manual features will be considered to check whether the performance of the different models varies with the feature engineering technique used.
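The abstract's point about word order can be illustrated with a minimal sketch. It is not taken from the thesis: the two sentences and the helper names `bag_of_words` and `bigrams` are made up for illustration, and it uses only the Python standard library rather than any paragraph vector implementation. It shows that a Bag-of-Words count is identical for two sentences that differ only in word order, while an order-aware view (adjacent word pairs) tells them apart.

```python
from collections import Counter


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens; word positions are discarded."""
    return Counter(text.lower().split())


def bigrams(text):
    """Return the ordered list of adjacent token pairs, which depends on word order."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


# Two hypothetical sentences with the same words in a different order.
a = "the service was great not slow"
b = "the service was slow not great"

# Bag-of-Words sees the same multiset of words in both sentences.
same_bow = bag_of_words(a) == bag_of_words(b)

# Adjacent word pairs differ, e.g. ("was", "great") versus ("was", "slow").
same_bigrams = bigrams(a) == bigrams(b)

print(same_bow, same_bigrams)
```

Bigrams are only a stand-in here for "a representation that is sensitive to order"; how PV-DBOW and PV-DM behave on such pairs depends on the trained model and is not demonstrated by this sketch.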
Item Type: | Thesis (Masters) |
---|---|
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; Q Science > QA Mathematics > Computer software; T Technology > T Technology (General) > Information Technology > Computer software |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Caoimhe Ní Mhaicín |
Date Deposited: | 11 Oct 2019 12:58 |
Last Modified: | 11 Oct 2019 12:58 |
URI: | https://norma.ncirl.ie/id/eprint/3847 |