Konan Ravi, Rohith (2024) A Smart Cloud – Based Document Search Engine for Query Retrieval Using Large Learning Models (LLM's). Masters thesis, Dublin, National College of Ireland.
PDF (Master of Science), 3MB
PDF (Configuration Manual), 1MB
Abstract
The volume of documents is growing quickly: there are currently more than 140 million documents, and the number increases every year, so document retrieval needs to be effective. Current challenges stem from specialized language, relationships between documents, and imprecise user queries. Sophisticated NLP approaches such as semantic search and embeddings are central to addressing most of these problems. This paper explores the use of text-to-text transformers such as Google T5 and BART Large, fine-tuned for summarization and retrieval. After fine-tuning, the BART Large model performed better, with its ROUGE-1 score increasing from 0.269 to 0.461 along with improved unigram overlap and context relevance. Moreover, embedding models such as Sentence Encoder and FastText achieved near-perfect retrieval accuracy of 98% and 96%, respectively, outperforming the traditional TF-IDF and Count Vectorizer models. By combining cloud-native architectures with data stores such as MySQL or FAISS, the system enables accurate and efficient document search at large scale. This research offers a strong foundation for most contemporary semantic search architectures that meet user expectations of accuracy and value.
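As a rough illustration of the ROUGE-1 metric cited in the abstract, the sketch below computes clipped unigram overlap between a generated summary and a reference. The abstract does not say which ROUGE-1 variant (precision, recall or F1), tokenizer or stemming the thesis used, so the whitespace tokenization, the lowercasing and the function name `rouge1` are assumptions made only for this example.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """ROUGE-1 precision, recall and F1 on lowercased whitespace tokens.

    Assumption: no stemming or stopword handling; the thesis may differ.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sentences, not data from the thesis.
scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
```

Here five of the six unigrams match, so precision, recall and F1 are all 5/6. A reported rise from 0.269 to 0.461 would correspond to a larger share of reference unigrams being reproduced by the fine-tuned model's summaries.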
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Siddig, Abubakr (email: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Cloud computing; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing |
Divisions: | School of Computing > Master of Science in Cloud Computing |
Depositing User: | Ciara O'Brien |
Date Deposited: | 15 Jul 2025 13:36 |
Last Modified: | 15 Jul 2025 13:36 |
URI: | https://norma.ncirl.ie/id/eprint/8114 |