NORMA eResearch @NCI Library

An Eclectic Approach for Predicting Customer Segmentation to Facilitate Market Basket Analysis

Ashwini Mohan (2022) An Eclectic Approach for Predicting Customer Segmentation to Facilitate Market Basket Analysis. Masters thesis, Dublin, National College of Ireland.

Data mining technologies are being applied to large amounts of consumer data to gain useful insights about customers, implement customer segmentation for marketing purposes, understand individual purchasing behaviour, and create recommendation systems, all of which can help an organization maximize its profitability. The primary objective of this research was to combine machine learning with data mining techniques to forecast customer segments based on transaction data. The secondary objective, which extends earlier research in the same domain with a novel approach, was to evaluate the hypothesis that association rule mining is more effective when applied to segmented customer data than when applied to the entire dataset. The Online Retail II dataset, containing 1,067,371 transactional records, was used in this research and was divided into 4 clusters using RFM analysis and the K-Means clustering technique. The primary objective was tested with four multiclass classification models: KNN, Decision Tree, Random Forest, and Light Gradient Boosted Machine (LGBM). On the basis of training time and K-Fold validation accuracy, Random Forest was deemed the best model, with a K-Fold test accuracy of 85.15% and a training time of 0.1935 seconds. The results obtained for the secondary objective led to rejecting the hypothesis, since no recommendations were generated for two of the classes. The approach undertaken in this research is effective in categorizing customers into their respective groups, produces a satisfactory outcome, and provides information about best practices to adopt in this area to get better results.
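
The RFM step named in the abstract can be illustrated with a minimal sketch. It assumes the common definition of RFM (recency as days since the last purchase, frequency as the number of distinct invoices, monetary as total spend) and is not taken from the thesis itself. The transaction rows, the `rfm_table` helper, and the snapshot date are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, invoice_id, invoice_date, amount).
# These rows are illustrative only and are not drawn from the thesis dataset.
transactions = [
    ("C1", "INV1", date(2011, 11, 1), 50.0),
    ("C1", "INV2", date(2011, 12, 1), 30.0),
    ("C2", "INV3", date(2011, 6, 15), 200.0),
    ("C3", "INV4", date(2011, 12, 5), 10.0),
    ("C3", "INV4", date(2011, 12, 5), 15.0),  # second line of the same invoice
]


def rfm_table(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    Assumed definitions: recency is the number of days between the snapshot
    date and the customer's last purchase, frequency is the number of
    distinct invoices, and monetary is the total amount spent.
    """
    last_purchase = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for customer, invoice, day, amount in rows:
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        invoices[customer].add(invoice)
        spend[customer] += amount
    return {
        c: ((snapshot - last_purchase[c]).days, len(invoices[c]), spend[c])
        for c in last_purchase
    }


rfm = rfm_table(transactions, snapshot=date(2011, 12, 10))
for customer, features in sorted(rfm.items()):
    print(customer, features)
```

Under these assumptions, each customer is reduced to a three-value feature vector. A clustering step such as the K-Means technique mentioned in the abstract would then group those vectors, typically after scaling them.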

Item Type: Thesis (Masters)
Staikopoulos, Athanasios
Uncontrolled Keywords: Association Mining Rule; Decision Tree; K-Means Clustering; KNN Classifier; LGBM; Random Forest; RFM Analysis
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
H Social Sciences > HF Commerce > Marketing > Consumer Behaviour
H Social Sciences > HF Commerce > Electronic Commerce
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Tamara Malone
Date Deposited: 17 Jan 2023 17:11
Last Modified: 16 Mar 2023 14:55
