NORMA eResearch @NCI Library

Football Player Selection Based on Positions and Skills Using Ensemble Machine Learning and Similarity Measure Techniques

Murugappan, Murugappan (2022) Football Player Selection Based on Positions and Skills Using Ensemble Machine Learning and Similarity Measure Techniques. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 991kB
PDF (Configuration manual), 1MB


Football attracts millions of fans worldwide, yet assembling an effective team is difficult when sports and team analysts select players through manual procedures. This research applies several ensemble machine learning algorithms to predict player positions automatically, using two novel approaches based on different features, and uses statistical feature selection techniques to select the top 30 features for predicting a player's position. Four ensemble machine learning algorithms were evaluated. The random forest classifier gave the highest accuracy in the approach that predicts only the 4 major positions, while the support vector classifier performed better in the approach that predicts the 27 different major and minor positions. Hyperparameter tuning produced no substantial improvement in either approach. Similarity measure techniques such as cosine similarity and Euclidean distance were also used to select the most similar players based on their skills. Cosine similarity performed better than Euclidean distance and can be applied in other sports domains where players are selected based on their skills. All models were evaluated using accuracy, precision, recall, F1 score, and cross-validation score. Both the basic and tuned models were also assessed with confusion matrices, which show the correctly predicted and the misclassified outcomes.
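
The two similarity measures named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the player names, and the three-value skill vectors below are all hypothetical and invented purely for illustration. The sketch shows why the two measures can rank the same candidates differently: cosine similarity compares the shape of a skill profile regardless of its overall level, while Euclidean distance compares absolute values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two skill vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two skill vectors (0.0 = identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_similar(target, players):
    """Return candidate names ranked by each measure, most similar first."""
    others = [name for name in players if name != target]
    by_cosine = sorted(
        others,
        key=lambda n: cosine_similarity(players[target], players[n]),
        reverse=True,  # higher cosine similarity means more similar
    )
    by_euclidean = sorted(
        others,
        key=lambda n: euclidean_distance(players[target], players[n]),
    )  # lower distance means more similar
    return by_cosine, by_euclidean


# Hypothetical skill ratings (invented values, not from the thesis data).
players = {
    "player_a": [80, 60, 40],
    "player_b": [40, 30, 20],  # same profile shape as player_a at half the level
    "player_c": [75, 65, 50],  # close to player_a in absolute ratings
}

by_cosine, by_euclidean = rank_similar("player_a", players)
print("cosine ranking:   ", by_cosine)
print("euclidean ranking:", by_euclidean)
```

With these invented values, cosine similarity puts player_b first because its profile is proportional to player_a's, whereas Euclidean distance puts player_c first because its ratings are numerically closest.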

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
G Geography. Anthropology. Recreation > GV Recreation Leisure > Sports > Soccer
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Tamara Malone
Date Deposited: 23 Feb 2023 16:44
Last Modified: 02 Mar 2023 08:32
