Understanding the importance of enhanced logical reasoning among large language models with the help of hybrid symbolic architecture

Chinnameda, Surya Prakash

Chinnameda, Surya Prakash (2025) Understanding the importance of enhanced logical reasoning among large language models with the help of hybrid symbolic architecture. Masters thesis, Dublin, National College of Ireland.

Abstract

Hybrid symbolic-neural architectures offer a way to address the limited step-wise reasoning and explainability of large language models (LLMs). This study addresses the automatic classification of logical reasoning questions into five categories (Combinatorics, Probability & Statistics, Logic Puzzles, Math Word Problems, and General Reasoning) as a step toward AI models that understand reasoning (Liang et al., 2025; Hóu, 2025). A hybrid approach was used, combining a transparent rule-based keyword categorizer with machine learning (TF-IDF + Logistic Regression) and deep learning (DistilBERT). The rule-based component uses curated keyword and phrase lists for each category, so that labels can be bootstrapped quickly from unstructured question text.
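
As an illustration of the rule-based component described above, the following is a minimal sketch of a keyword categorizer. The keyword lists, the `categorize` function, and the tie-breaking and fallback rules are assumptions made for this example; the thesis's curated lists and exact matching logic are not reproduced here.

```python
import re

# Hypothetical keyword lists, for illustration only; the thesis's
# curated lists are not reproduced here.
KEYWORDS = {
    "Combinatorics": ["how many ways", "permutation", "combination", "arrange"],
    "Probability & Statistics": ["probability", "expected value", "mean", "variance"],
    "Logic Puzzles": ["liar", "truth-teller", "knight", "knave"],
    "Math Word Problems": ["total cost", "speed", "how much", "per hour"],
}
FALLBACK = "General Reasoning"  # assumed default when no keyword matches


def categorize(question: str) -> tuple[str, list[str]]:
    """Return (label, matched phrases) for a question.

    The label with the most whole-word phrase matches wins; on a tie the
    category listed first in KEYWORDS is kept. The matched phrases are
    returned so that each decision can be inspected.
    """
    text = question.lower()
    best_label, best_hits = FALLBACK, []
    for label, phrases in KEYWORDS.items():
        hits = [
            p for p in phrases
            if re.search(r"\b" + re.escape(p) + r"\b", text)
        ]
        if len(hits) > len(best_hits):
            best_label, best_hits = label, hits
    return best_label, best_hits


label, evidence = categorize("What is the probability of rolling two sixes?")
print(label, evidence)
```

Returning the matched phrases alongside the label is what keeps such a component transparent: every bootstrapped label can be traced back to the keywords that triggered it.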

Category distribution, class imbalance, and text length statistics were analysed. The maximum token length was increased so that more of the content of longer questions is retained. Because the training set is easier to balance than the validation or test sets, only the training set was balanced, using a WeightedRandomSampler. Experimental results indicate that the hybrid (Asimit et al., 2022), XGBoost (Shari et al., 2021) DistilBERT-plus-rules model outperforms the baseline in macro-F1 score, especially for the minority classes Logic Puzzles and Probability & Statistics. Accuracy, Precision, Recall, and per-class F1 show that this approach can attain high accuracy while remaining interpretable thanks to its symbolic component.
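
The balancing and evaluation steps mentioned above can be sketched in plain Python. This is a minimal sketch under stated assumptions: inverse class frequency is one common weighting scheme and is assumed here, not confirmed by the abstract, and the function names `inverse_frequency_weights` and `macro_f1` are hypothetical. In PyTorch, per-sample weights of this kind would typically be passed to `torch.utils.data.WeightedRandomSampler(weights, num_samples, replacement=True)`.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-sample weights proportional to 1 / (class count).

    With these weights every class has the same total sampling mass,
    so minority classes are drawn as often as majority classes.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, not data from the thesis.
train_labels = ["Math", "Math", "Math", "Logic"]
weights = inverse_frequency_weights(train_labels)
score = macro_f1(["a", "a", "b", "b"], ["a", "a", "a", "b"])
```

Because macro-F1 averages per-class scores without weighting by class size, a gain on a minority class moves the metric as much as a gain on a majority class, which is why it suits an imbalanced label set.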

This work demonstrates that combining symbolic rules with neural models improves the accuracy of reasoning-question classification, and it provides a scalable, reproducible process for similar NLP categorization tasks (Rudin et al., 2022; Song et al., 2025).

Item Type:	Thesis (Masters)
Supervisors:	Thomas, Lavish (email: UNSPECIFIED)
Subjects:	Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions:	School of Computing > Master of Science in Artificial Intelligence
Depositing User:	Ciara O'Brien
Date Deposited:	28 May 2026 13:28
Last Modified:	28 May 2026 13:28
URI:	https://norma.ncirl.ie/id/eprint/9316
