Ghadge, Srushti Prakash (2022) Electricity Theft Detection Using Machine Learning Algorithms: China. Masters thesis, Dublin, National College of Ireland.
PDF (Master of Science), 901kB
PDF (Configuration manual), 1MB
Abstract
This research investigates consumers' actual power consumption to identify fraudulent consumers who steal energy from power lines. Such theft can cause an imbalance in a power line, which affects other consumers, utility companies, and the government. Smart energy meters collect consumer data at regular time intervals, and this work uses the dataset collected by the State Grid Corporation of China (SGCC) to separate fraudulent consumers from honest ones. Examining machine learning and deep learning algorithms for this pattern recognition problem, the research found that when the dataset is imbalanced, it becomes difficult for such models to learn the features and characteristics of both classes accurately. To address this, the research proposes a hybrid of a deep learning algorithm and a machine learning boosting algorithm. The boosting algorithm is known for handling imbalanced datasets, while the deep learning algorithm extracts additional hidden features from the data so that the machine learning model can classify the records and correctly identify fraudulent consumers. To handle the imbalance, class weights were added during model training instead of over- or under-sampling the dataset, which would destroy the dataset's actual characteristics. The proposed methodology identifies fraudulent consumers at a rate of around 96%, and the model achieves an AUC score of approximately 96%. Models implemented for comparison include random forest, K-means clustering, and decision tree.
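The abstract does not state how the class weights were computed. As a minimal sketch only, the following assumes the common "balanced" heuristic, where each class is weighted by n_samples / (n_classes * class_count). The function names and the toy labels are hypothetical illustrations and are not taken from the thesis or the SGCC data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Map each class to n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def sample_weights(labels):
    """Expand the per-class weights into one weight per training sample."""
    weights = balanced_class_weights(labels)
    return [weights[y] for y in labels]


# Hypothetical toy labels (assumption): 0 = honest consumer, 1 = theft.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

# The minority class receives the larger weight, so both classes
# contribute the same total weight to a weighted training loss.
for cls in sorted(weights):
    print(cls, round(weights[cls], 4))
```

Under this heuristic the two classes carry equal total weight while every original record is kept, which is the property the abstract contrasts with over- and under-sampling. Many training libraries accept such per-sample weights through a sample-weight argument at fit time, though whether the thesis used that mechanism is not stated here.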
Item Type: | Thesis (Masters) |
---|---|
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Electricity Supply; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Tamara Malone |
Date Deposited: | 24 Jan 2023 17:31 |
Last Modified: | 03 Mar 2023 12:05 |
URI: | https://norma.ncirl.ie/id/eprint/6126 |