Ravichandran, Shoban (2025) A Transformer-based Framework to Automatically Refactor Legacy Code. Masters thesis, Dublin, National College of Ireland.
Abstract
Legacy code refactoring involves systematically restructuring outdated code bases to improve maintainability, readability, and extensibility while preserving existing functionality. It is a critical challenge in software engineering because technical debt accumulates and raises the future cost of repairing the code. This research proposes a transformer-based framework that combines Retrieval-Augmented Generation (RAG) with transformer-based language models and NSGA-II multi-objective optimization for automated Python code refactoring. The framework integrates multiple state-of-the-art language models and uses NSGA-II optimization to balance competing quality objectives. Results demonstrate improvements in both refactoring quality and system efficiency. NSGA-II optimization identified Llama3-70B-8192 as the optimal model configuration, with gains in Answer Relevance (+2.00%), Response Completeness (+7.61%), and CodeBLEU Score (+69.23%). The optimization also improved system efficiency, reducing latency by 10.4% and token usage by 24.9%. The framework achieved an overall 13.78% improvement in core quality metrics while maintaining a 100% success rate across all test cases. This research aligns with Sustainable Development Goal 9 (SDG 9), which emphasizes resilient infrastructure and sustainable industrial growth. It benefits software development teams by reducing manual refactoring effort, organizations by lowering code remediation costs, and society by enabling more resilient software infrastructure.
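As a rough illustration of what "balancing competing quality objectives" means, the sketch below filters candidate configurations down to the non-dominated (Pareto-optimal) set, which is the comparison at the core of NSGA-II's ranking step. It is not the thesis implementation: the candidate names, the objective keys, and every score are hypothetical placeholders chosen for the example, and the full algorithm adds crowding distance, selection, and variation on top of this filter.

```python
from typing import Dict, List

Scores = Dict[str, float]

# Assumed objective names (not taken from the thesis code).
MAXIMIZE = ("relevance", "completeness", "codebleu")  # higher is better
MINIMIZE = ("latency_s", "tokens")                    # lower is better


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(a[k] >= b[k] for k in MAXIMIZE) and all(
        a[k] <= b[k] for k in MINIMIZE
    )
    better = any(a[k] > b[k] for k in MAXIMIZE) or any(
        a[k] < b[k] for k in MINIMIZE
    )
    return no_worse and better


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, score in candidates.items()
        if not any(
            dominates(other, score)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Placeholder scores for illustration only; these are not results from the thesis.
candidates = {
    "model_a": {"relevance": 0.90, "completeness": 0.85, "codebleu": 0.40,
                "latency_s": 2.0, "tokens": 900},
    "model_b": {"relevance": 0.88, "completeness": 0.80, "codebleu": 0.35,
                "latency_s": 2.5, "tokens": 1000},
    "model_c": {"relevance": 0.85, "completeness": 0.80, "codebleu": 0.45,
                "latency_s": 1.5, "tokens": 800},
}

front = pareto_front(candidates)
print(front)
```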
| Item Type: | Thesis (Masters) |
|---|---|
| Supervisors: | Stynes, Paul (email unspecified) |
| Uncontrolled Keywords: | Code Refactoring; RAG Framework; Multi-objective Optimization; NSGA-II; Python; Large Language Models |
| Subjects: | Q Science > QA Mathematics > Computer software; T Technology > T Technology (General) > Information Technology > Computer software; Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing |
| Divisions: | School of Computing > Master of Science in Artificial Intelligence |
| Depositing User: | Ciara O'Brien |
| Date Deposited: | 04 Jun 2026 15:19 |
| Last Modified: | 04 Jun 2026 15:19 |
| URI: | https://norma.ncirl.ie/id/eprint/9343 |