Velishetty, Nagaraju (2023) Personal Identifiable Information (PII) Detection and Identification for Fintech with AI and Text Analytics. Masters thesis, Dublin, National College of Ireland.
Abstract
Detecting Personally Identifiable Information (PII) in text datasets is critical to safeguarding privacy and ensuring data protection. This thesis gives an overview of applying Named Entity Recognition (NER) algorithms, particularly the BERT NER model, to PII detection in unstructured text datasets.
The thesis introduces an innovative approach that combines deep learning techniques with rule-based methods to identify PII in unstructured text. The experiments conducted in this study demonstrate that the proposed model detects PII entities with high accuracy, contributing to enhanced privacy and data security.
The proposed hybrid model for PII detection combines a deep learning-based NER model with rule-based patterns. In evaluation, the model achieves high precision and recall when detecting PII in text datasets. The hybrid approach capitalizes on the strengths of both deep learning and rule-based methods, providing a robust solution for PII detection.
Moreover, one of the resources discussed focuses on practical techniques for PII detection. It emphasizes using pre-trained language models, such as BERT, and fine-tuning them on domain-specific datasets. It also highlights the importance of understanding the contextual nuances and the specific types of PII relevant to the target domain. Fine-tuning pre-trained language models in this way can significantly improve the accuracy of PII detection.
This paper emphasizes the importance of PII detection in text datasets and explores various approaches to the task. Two strategies are presented as effective for accurately identifying PII entities and ensuring privacy and data protection: combining deep learning techniques with rule-based methods, and fine-tuning pre-trained language models.
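As an illustration of the hybrid idea described in the abstract, the following minimal Python sketch merges spans from rule-based patterns with spans from an NER model. It is not the thesis's implementation. The regular expressions, the `Span` type, the `detect_pii` function, and the policy that a rule match wins when it overlaps an NER span are all assumptions made for this example. `toy_ner` is a hypothetical stand-in for a fine-tuned BERT NER model.

```python
import re
from typing import Callable, List, NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    label: str
    source: str  # "rule" or "ner"


# Illustrative patterns only (assumption): the abstract does not list its rules.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def rule_spans(text: str) -> List[Span]:
    """Return one span per regex match, tagged with the rule's label."""
    return [
        Span(m.start(), m.end(), label, "rule")
        for label, pattern in RULES.items()
        for m in pattern.finditer(text)
    ]


def detect_pii(text: str, ner: Callable[[str], List[Span]]) -> List[Span]:
    """Union of rule and NER spans; an NER span overlapping a rule span is dropped.

    The "rule wins on overlap" policy is an assumption chosen for this sketch.
    """
    rules = rule_spans(text)
    kept = list(rules)
    for s in ner(text):
        if not any(s.start < r.end and r.start < s.end for r in rules):
            kept.append(s)
    return sorted(kept)  # ordered by start offset


def toy_ner(text: str) -> List[Span]:
    """Hypothetical stand-in for a fine-tuned BERT NER model."""
    name = "Alice Murphy"
    i = text.find(name)
    return [Span(i, i + len(name), "PERSON", "ner")] if i >= 0 else []
```

Calling `detect_pii(text, toy_ner)` returns the combined spans sorted by start offset. In practice, `toy_ner` would be replaced by a wrapper that converts a real model's token-level predictions into character spans.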
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Del Rosal, Victor (email unspecified) |
Uncontrolled Keywords: | Safeguard; PII entities; Data security; Deep learning; Privacy |
Subjects: | Q Science > QA Mathematics > Computer software > Computer Security; T Technology > T Technology (General) > Information Technology > Computer software > Computer Security; H Social Sciences > HG Finance > Fintech; T Technology > T Technology (General) > Information Technology > Fintech; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning |
Divisions: | School of Computing > Master of Science in FinTech |
Depositing User: | Tamara Malone |
Date Deposited: | 10 Aug 2024 09:57 |
Last Modified: | 10 Aug 2024 09:57 |
URI: | https://norma.ncirl.ie/id/eprint/7037 |