NORMA eResearch @NCI Library

Spelling and Grammatical Error Detection for Informal Turkish Texts with Morphologically Sensible Models

Bayram, Onur (2024) Spelling and Grammatical Error Detection for Informal Turkish Texts with Morphologically Sensible Models. Masters thesis, Dublin, National College of Ireland.

Abstract

Turkish is a morphologically rich language with distinctive characteristics such as agglutination and vowel harmony, which makes it challenging to build effective spelling and grammatical error detection models for informal Turkish texts. Existing approaches do not adequately account for these characteristics, especially in informal texts, which leads to poor precision. This research proposes and discusses morphologically sensible sequential deep learning models for spelling and grammatical error detection in informal Turkish texts. The proposed models are a recurrent neural network (RNN), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Bidirectional GRU (Bi-GRU), and Bidirectional LSTM (Bi-LSTM). All are neural network architectures designed for handling sequential data, which makes them beneficial for tasks specific to informal Turkish. The models are trained and tested on a dataset of two million rows consisting of both formal and informal Turkish texts from Turkish news, Wikipedia, and Twitter. Each of the proposed models achieves an accuracy of 97%. Detailed results for the five proposed models are presented in this paper through classification reports, confusion matrices, accuracy-loss plots, and discussion. The proposed models are highly effective in filling a gap in Turkish natural language processing and in improving the accuracy of spelling and grammatical error detection for informal Turkish texts. In the implementation section, the research also detects and displays misspelled words in informal texts through five text experiments, one case study for each of the proposed models.

Item Type: Thesis (Masters)
Supervisors: Syed, Muslim Jameel (email: UNSPECIFIED)
Uncontrolled Keywords: Natural Language Processing; Neural Network Architectures; Spelling and Grammatical Error Detection; Text Analysis; Turkish Language
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Divisions: School of Computing > Master of Science in Artificial Intelligence
Depositing User: Tamara Malone
Date Deposited: 03 Apr 2025 16:36
Last Modified: 03 Apr 2025 16:36
URI: https://norma.ncirl.ie/id/eprint/7359
