NORMA eResearch @NCI Library

Natural Language Processing: Minimizing Bias and Misunderstanding for AI Models in Understanding and Generating Human-like Text

Verma, Chetna (2024) Natural Language Processing: Minimizing Bias and Misunderstanding for AI Models in Understanding and Generating Human-like Text. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 1MB
PDF (Configuration manual), 773kB

Abstract

This research presents a new approach to improving natural language processing (NLP) models by integrating context-aware tokenisation, focusing primarily on the GPT-2 architecture. The approach explicitly addresses word sense disambiguation, a long-standing problem in NLP. By generating context-specific tokens, our method significantly improves the model’s ability to interpret and generate text containing ambiguous terms. This improvement is essential when dealing with polysemous words, as it allows the model to understand and use each word’s context more effectively. Our results show significantly improved model performance across various NLP tasks, such as text generation and semantic analysis. This development demonstrates the potential of more accurate and context-aware language models and opens new research directions in advanced text processing and understanding.
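
This record does not describe the thesis's actual tokenisation scheme. Purely as an illustration of the general idea, the toy Python sketch below tags a polysemous word with a sense label chosen from nearby cue words, so that the same surface word can yield different tokens in different contexts. The sense inventory `SENSE_CUES`, the `word@sense` token format, the function name `context_aware_tokens`, and the window size are all hypothetical assumptions made for this example. The sketch does not use GPT-2 or its tokenizer.

```python
# Hypothetical sense inventory (an assumption for illustration only):
# for each ambiguous word, cue words that hint at each of its senses.
SENSE_CUES = {
    "bank": {
        "finance": {"money", "loan", "deposit", "account"},
        "river": {"river", "water", "shore", "fishing"},
    },
}


def context_aware_tokens(text, window=3):
    """Split text on whitespace and tag known ambiguous words with a sense.

    A word listed in SENSE_CUES becomes "word@sense" when at least one cue
    word for that sense appears within `window` words on either side.
    Otherwise the word is emitted unchanged.
    """
    words = text.lower().split()
    tokens = []
    for i, word in enumerate(words):
        senses = SENSE_CUES.get(word)
        if senses is None:
            tokens.append(word)  # not a known ambiguous word
            continue
        # Collect the neighbouring words, excluding the word itself.
        start = max(0, i - window)
        context = set(words[start:i] + words[i + 1:i + 1 + window])
        # Pick the sense whose cue words overlap the context the most.
        best_sense, best_score = None, 0
        for sense, cues in senses.items():
            score = len(cues & context)
            if score > best_score:
                best_sense, best_score = sense, score
        tokens.append(f"{word}@{best_sense}" if best_sense else word)
    return tokens
```

With these assumed cue lists, `context_aware_tokens("she opened a bank account")` yields `bank@finance` for the ambiguous word, while `context_aware_tokens("we sat on the river bank")` yields `bank@river`. A sentence with no cue words nearby leaves `bank` untagged. The sketch splits on whitespace only, so punctuation attached to a word would prevent a match.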

Item Type: Thesis (Masters)
Supervisors:
Name: Jain, Mayank
Email: UNSPECIFIED
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Divisions: School of Computing > Master of Science in Artificial Intelligence
Depositing User: Tamara Malone
Date Deposited: 07 Apr 2025 10:54
Last Modified: 07 Apr 2025 10:54
URI: https://norma.ncirl.ie/id/eprint/7377
