NORMA eResearch @NCI Library

Clustering Based Approach to Enhance Association Rule Mining

Kanhere, Samruddhi Shailesh (2020) Clustering Based Approach to Enhance Association Rule Mining. Masters thesis, Dublin, National College of Ireland.

Association rule mining algorithms such as Apriori and FPGrowth are used extensively in the retail industry to uncover consumer buying patterns. However, scaling these algorithms to cope with rapidly growing data volumes remains a major challenge. This research presents a novel clustering-based approach that addresses the problem by reducing the dataset size. The products are clustered based on their frequency and price. Another important aspect of this study is finding interesting rules by performing differential market basket analysis, which identifies association rules that are likely to be ignored by the traditional approach. With the cluster-based approach, the same set of rules can be generated using only 7% of the total 16,210 items, which directly reduces the processing overhead and therefore the computation time. Furthermore, the results of the differential market basket analysis highlighted a few interesting rules that were missing from the original set of rules. The clustering-based approach used in this study does not rely on frequent items alone; it also considers each item's contribution to overall revenue by taking its price into account, thus adding value to the state of the art. In addition, the exclusion rate of the least contributing products improved from 45% to 93%. These results suggest that the computation cost can be significantly reduced, and that more accurate rules can be generated by applying differential market basket analysis. What effect product exclusion has on the business is a question that must be addressed in further research.
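
The idea of clustering products by frequency and price before mining can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name the clustering algorithm, so plain k-means with two fixed starting centroids is assumed here, and the toy baskets, prices, helper names (`item_features`, `kmeans_2d`, `frequent_pairs`) and the rule "keep the cluster with the highest revenue" are all hypothetical choices made for illustration. Frequent pairs stand in for full association rules.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data; none of these values come from the thesis.
TRANSACTIONS = [
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "cheese"},
    {"milk", "butter", "bread"},
    {"cheese", "milk"},
    {"bread", "milk", "gum"},
    {"napkins", "cheese"},
    {"cheese", "bread", "butter"},
]
PRICES = {"bread": 2.0, "milk": 1.5, "butter": 3.0,
          "cheese": 4.0, "gum": 0.5, "napkins": 1.0}


def item_features(transactions, prices):
    """Return raw item counts and (frequency, price) features scaled to [0, 1]."""
    freq = Counter(item for basket in transactions for item in basket)
    max_f = max(freq.values())
    max_p = max(prices[item] for item in freq)
    scaled = {item: (freq[item] / max_f, prices[item] / max_p) for item in freq}
    return freq, scaled


def kmeans_2d(points, centroids, iterations=10):
    """Plain k-means on 2-D points with given starting centroids (assumed algorithm)."""
    centroids = list(centroids)
    labels = {}
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for item, (x, y) in points.items():
            labels[item] = min(
                range(len(centroids)),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [points[i] for i, lab in labels.items() if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels


def frequent_pairs(transactions, min_count):
    """Count item pairs and keep those appearing in at least min_count baskets."""
    counts = Counter(
        pair for basket in transactions for pair in combinations(sorted(basket), 2)
    )
    return {pair: n for pair, n in counts.items() if n >= min_count}


freq, features = item_features(TRANSACTIONS, PRICES)
labels = kmeans_2d(features, centroids=[(0.0, 0.0), (1.0, 1.0)])

# Keep the cluster whose items contribute the most revenue (count * price).
revenue = Counter()
for item, lab in labels.items():
    revenue[lab] += freq[item] * PRICES[item]
best = max(revenue, key=revenue.get)
kept = {item for item, lab in labels.items() if lab == best}

# Mine the reduced baskets and compare with mining the full data.
reduced = [basket & kept for basket in TRANSACTIONS]
print(sorted(kept))
print(frequent_pairs(TRANSACTIONS, 3) == frequent_pairs(reduced, 3))
```

On this toy data the low-frequency, low-price items fall into their own cluster and are dropped, while the frequent pairs found on the reduced baskets match those found on the full baskets. Whether that holds on real retail data depends on the support threshold and on how the clusters are chosen.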

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Dan English
Date Deposited: 20 Jan 2021 16:19
Last Modified: 20 Jan 2021 16:19
