NORMA eResearch @NCI Library

Sentiment Analysis of Tweets to Classify the Box Office Success of Movies

Guda, Kapardhi Kumar (2018) Sentiment Analysis of Tweets to Classify the Box Office Success of Movies. Masters thesis, Dublin, National College of Ireland.

The movie industry is among the most lucrative businesses, judging by the revenue it generates every year: Hollywood movies took a total of 11 billion dollars at the box office in 2017. The main objective of this project is to predict whether a movie will be a "flop", "average", "hit" or "superhit", which would benefit movie distributors and production companies. Tweets related to multiple movies were extracted, and sentiments and emotions were derived from these tweets using the R language, with a score assigned to each emotion and sentiment before the models were run. All precautions were taken to hide identifiable information such as Tweet ID and screen name. The machine learning models used in the project were SVM, KNN, Naive Bayes and Random Forest. The models were run in Python and were able to predict the class of a movie ("flop", "average", "hit" or "superhit") with an accuracy of 60.76%, based on the emotion and sentiment scores, which would be useful to the entertainment industry in boosting its revenues. Compared with the other models, SVM and KNN performed well in predicting the outcome of the movies. Analysis of the results indicated that movies with higher budgets need larger revenues to succeed because of the additional expenditure they carry, such as print and advertising.
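As a rough illustration of the classification step described in the abstract, the sketch below runs a k-nearest-neighbours vote over per-movie sentiment and emotion scores using only the Python standard library. It is not the thesis code: the feature layout (positive, negative, joy, anger), the toy training values, the function name `knn_predict` and the choice of k = 3 are all assumptions made for this example.

```python
from collections import Counter
from math import dist

# Hypothetical per-movie feature vectors: (positive, negative, joy, anger).
# These numbers are invented for illustration and are NOT thesis data.
TRAIN = [
    ((0.20, 0.60, 0.10, 0.50), "flop"),
    ((0.25, 0.55, 0.15, 0.45), "flop"),
    ((0.45, 0.40, 0.35, 0.30), "average"),
    ((0.50, 0.35, 0.40, 0.25), "average"),
    ((0.70, 0.20, 0.60, 0.15), "hit"),
    ((0.75, 0.18, 0.65, 0.10), "hit"),
    ((0.90, 0.05, 0.85, 0.05), "superhit"),
    ((0.92, 0.04, 0.88, 0.03), "superhit"),
]


def knn_predict(features, train=TRAIN, k=3):
    """Return the majority class among the k nearest training movies."""
    # Rank training movies by Euclidean distance to the query vector.
    nearest = sorted(train, key=lambda row: dist(row[0], features))[:k]
    # Count the class labels of the k closest movies and take the most common.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((0.88, 0.06, 0.80, 0.06)))  # prints "superhit"
```

In practice a library implementation of KNN, SVM, Naive Bayes or Random Forest would replace the hand-written vote; the sketch only shows how score vectors map to one of the four classes.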

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
H Social Sciences > HD Industries. Land use. Labor > Specific Industries > Film Industry
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources > The Internet > World Wide Web > Websites > Online social networks
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Telecommunications > The Internet > World Wide Web > Websites > Online social networks
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Caoimhe Ní Mhaicín
Date Deposited: 03 Nov 2018 14:23
Last Modified: 03 Nov 2018 14:23
