Smart Data Masking using AI in Banking Transactions

Yelmar, Urmila Shridhar

Yelmar, Urmila Shridhar (2025) Smart Data Masking using AI in Banking Transactions. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 523kB
PDF (Configuration Manual), 370kB

Abstract

The accelerated digitalisation of banking has made it harder to protect customer privacy without degrading the effectiveness of fraud and intrusion detection tools. This thesis presents a context-sensitive, AI-assisted adaptive data masking framework that combines real-time risk evaluation and SHAP-based feature prioritisation with denoising autoencoder-based synthetic reconstruction. The aim is to maintain predictive performance while substantially reducing the risk of re-identification. Success is defined as an area under the curve (AUC) of at least 0.95 on the fraud detection task, with membership inference attack (MIA) accuracy at least 50% lower than baseline, averaged over bootstrap runs.

The framework is tested on three datasets spanning financial and cybersecurity contexts: IEEE-CIS Fraud Detection, PaySim mobile transactions, and CICIDS2017 network intrusion traces. In each case the train-test split is sealed before SHAP values are computed, to avoid label leakage. Sensitivity-tiered masking rules map SHAP importance thresholds directly to masking actions, which makes them reproducible. Privacy is measured with black-box MIAs and shadow models, k-anonymity scores, and Kolmogorov-Smirnov (KS) statistical tests; utility is measured by accuracy, precision, recall, F1-score, and AUC.
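As an illustration of how importance thresholds can be tied to masking actions and how a k-anonymity score is obtained, the following is a minimal sketch and not the thesis's implementation. The threshold values, the tier names, the direction of the mapping (more important features masked more strongly), and the helper names `assign_masking_actions` and `k_anonymity` are all assumptions made for the example. The importance scores stand in for mean absolute SHAP values computed elsewhere.

```python
from collections import Counter

# Hypothetical tier boundaries on each feature's share of total importance.
# The abstract does not state the thresholds used; these are placeholders.
HIGH_THRESHOLD = 0.20
MEDIUM_THRESHOLD = 0.05


def assign_masking_actions(importance):
    """Map each feature's importance score to a masking tier.

    `importance` is a dict of feature name -> non-negative score,
    for example a mean absolute SHAP value per feature.
    """
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    actions = {}
    for feature, score in importance.items():
        share = score / total
        if share >= HIGH_THRESHOLD:
            actions[feature] = "strong"    # assumed: strongest masking tier
        elif share >= MEDIUM_THRESHOLD:
            actions[feature] = "moderate"  # assumed: lighter masking tier
        else:
            actions[feature] = "none"      # assumed: left unmasked
    return actions


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records sharing all quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


if __name__ == "__main__":
    scores = {"amount": 0.6, "device_id": 0.3, "hour": 0.08, "currency": 0.02}
    print(assign_masking_actions(scores))
    rows = [
        {"age_band": "30-39", "region": "D"},
        {"age_band": "30-39", "region": "D"},
        {"age_band": "40-49", "region": "C"},
    ]
    print(k_anonymity(rows, ["age_band", "region"]))
```

Because the rules are a fixed function of the scores and the thresholds, the same inputs always produce the same masking plan, which is the sense in which threshold-based rules are reproducible.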

Results indicate that the proposed technique reduces average MIA accuracy from about 90% on unmasked data to about 46% on masked data (p < 0.05), while k-anonymity rises to at least 15. At the same time, fraud and intrusion detection models achieve 85 to 88 percent accuracy with nearly perfect precision and recall, outperforming the zero-masking and random-masking baselines, which lose more utility and yield smaller privacy gains. Distributional tests also confirm that reconstructed values differ from the originals (KS p < 0.01), which reduces the risk of leakage at the expense of model interpretability.
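For readers unfamiliar with the KS comparison of original and reconstructed values, the two-sample statistic is the largest gap between the two empirical distribution functions. The sketch below computes only that statistic in plain Python. The function name `ks_statistic` and the sample values are illustrative assumptions, and a p-value such as the one reported above would in practice come from a library routine such as `scipy.stats.ks_2samp`.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so checking every observed value is enough to find the maximum.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


if __name__ == "__main__":
    original = [1, 2, 3, 4]        # illustrative values only
    reconstructed = [3, 4, 5, 6]   # illustrative values only
    print(ks_statistic(original, reconstructed))
```

A statistic near 0 means the two samples are distributed almost identically, while a value near 1 means they barely overlap.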

The study presents a transparent, operationally viable, and empirically verified method for privacy-preserving machine learning in the financial sector, in which regulatory compliance, explainability, and adversarial robustness can be balanced without loss of predictive utility.

Item Type:	Thesis (Masters)
Supervisors:	Aleburu, Joel (email: UNSPECIFIED)
Uncontrolled Keywords:	Adaptive data masking; Privacy-preserving machine learning; Membership inference attacks; Explainable AI; SHAP values; Autoencoders; Financial fraud detection; k-anonymity
Subjects:	H Social Sciences > HG Finance; Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence; H Social Sciences > HG Finance > Banking; Q Science > QA Mathematics > Computer software > Computer Security; T Technology > T Technology (General) > Information Technology > Computer software > Computer Security; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions:	School of Computing > Master of Science in Cyber Security
Depositing User:	Ciara O'Brien
Date Deposited:	17 Jun 2026 09:38
Last Modified:	17 Jun 2026 09:38
URI:	https://norma.ncirl.ie/id/eprint/9383
