NORMA eResearch @NCI Library

Movie Spoilers Classification Over Online Commentary, Using Bi-LSTM Model With Pre-trained GloVe Embeddings

Lindo, Anyelo (2020) Movie Spoilers Classification Over Online Commentary, Using Bi-LSTM Model With Pre-trained GloVe Embeddings. Masters thesis, Dublin, National College of Ireland.

Over the past few decades, society and its ways of living have been reshaped to adapt to ongoing technological breakthroughs, and what was once a minor issue can now be a real problem. Movie spoilers in particular have become a matter of concern for cinema fans and for the film industry itself. Freedom of speech in the online communities where film fans meet makes user-generated content hard to moderate. Reviews and comments may reveal details of a movie's plot, which can ruin the cinema experience for thousands of movie-goers and, in turn, affect the film industry's revenues. We therefore propose a supervised deep learning model intended to serve as a foundation for future work in this field. To classify spoilers in online commentary on movies, the model uses a Bidirectional Long Short-Term Memory (Bi-LSTM) network with pre-trained Global Vectors (GloVe) embeddings, chosen to improve both classification accuracy and training speed. For comparison, we also used two well-known methods for extracting features from text, Bag of Words (BOW) and Term Frequency-Inverse Document Frequency (TF-IDF), to build four additional classifiers: Support Vector Machine (SVM), Logistic Regression (LR), Bernoulli Naive Bayes (NB) and Random Forest (RF). The results were satisfactory and support the proposed solution: our model not only discriminated between classes well compared with the other classifiers, but also completed training in a fairly short time.
Keywords: Bi-LSTM, Deep Learning, GloVe, NLP, Spoilers, Supervised, Text Classification, Word Embeddings.
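
As an illustration of the two baseline feature-extraction methods named in the abstract, the following is a minimal sketch of Bag of Words counts and TF-IDF weights in plain Python. It is not the thesis code: the example reviews, the whitespace tokenizer, the function names, and the unsmoothed idf = log(N / df) weighting are all assumptions made for illustration, and library implementations typically use smoothed variants.

```python
import math
from collections import Counter

# Hypothetical example reviews; not taken from the thesis dataset.
reviews = [
    "the hero dies at the end",
    "the acting was great",
]

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def bag_of_words(docs):
    # Build a sorted vocabulary, then one raw count vector per document.
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(docs):
    # Term frequency (count / document length) times an unsmoothed
    # inverse document frequency, log(N / df).
    vocab, count_vectors = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [
        sum(1 for row in count_vectors if row[j] > 0)
        for j in range(len(vocab))
    ]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weighted = []
    for row in count_vectors:
        total = sum(row)
        weighted.append(
            [(c / total) * idf[j] if total else 0.0 for j, c in enumerate(row)]
        )
    return vocab, weighted

vocab, bow_vectors = bag_of_words(reviews)
_, tfidf_vectors = tf_idf(reviews)
```

In this sketch a word that occurs in every review, such as "the", receives a TF-IDF weight of zero, while a word found in only one review, such as "dies", keeps a positive weight. That contrast with the raw Bag of Words counts is what lets a downstream classifier focus on distinctive terms.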

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
H Social Sciences > HD Industries. Land use. Labor > Specific Industries > Film Industry
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Dan English
Date Deposited: 23 Jun 2020 12:45
Last Modified: 23 Jun 2020 12:45
