NORMA eResearch @NCI Library

A Retrieval-Augmented Generation Framework for Medical Question Answer

Hussain, Madni Ali (2024) A Retrieval-Augmented Generation Framework for Medical Question Answer. Masters thesis, Dublin, National College of Ireland.


Abstract

Large Language Models (LLMs) can generate text that is factually incorrect. This occurs when the model fails to accurately represent or reason about the real world. Retrieval-Augmented Generation (RAG) allows an LLM to refer to external data during generation, improving the factual correctness and contextual understanding of its output. Existing generation models do not yet meet academic standards and frequently hallucinate information that is absent from their training data. The challenge is to minimize LLM hallucination and to generate accurate answers, which is especially important in the medical field. This research proposes a RAG framework that uses LLMs to accurately generate answers for medical datasets. The framework combines RAG techniques, LLMs, and a knowledge base stored in a vector database. The LLMs used are Llama 3, Gemma2, Mistral, and GPT2. The results of these models, integrated with RAG, are evaluated using the following metrics: BLEU, ROUGE-1, BERT P, BERT R, BERT F1, and Perplexity. This research will benefit medical researchers and doctors by helping them obtain semantically correct medical information without hallucinations.
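
As a rough illustration of the retrieve-then-generate pattern the abstract describes, the following minimal Python sketch ranks passages from a small in-memory knowledge base against a question, assembles a grounded prompt, and scores an answer with a simple unigram-overlap (ROUGE-1 style) F1. It is not the thesis implementation: the bag-of-words "embedding", the in-memory list standing in for a vector database, the example passages, and every function name here are assumptions made for illustration, and the call to an actual LLM is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy embedding: a term-count vector (assumption; a real system
    would use a learned embedding model and a vector database)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q_vec, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, context):
    """Assemble a prompt that asks the model to answer from the context only."""
    lines = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{lines}\n"
        f"Question: {question}\nAnswer:"
    )


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate answer and a reference answer."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical knowledge base used only for this illustration.
KNOWLEDGE_BASE = [
    "Metformin is a first-line medication for type 2 diabetes.",
    "Amoxicillin is an antibiotic used to treat bacterial infections.",
    "Ibuprofen is a nonsteroidal anti-inflammatory drug used for pain and fever.",
]

if __name__ == "__main__":
    question = "Which medication is first-line for type 2 diabetes?"
    context = retrieve(question, KNOWLEDGE_BASE, k=1)
    # The prompt would be passed to an LLM; that call is omitted here.
    print(build_prompt(question, context))
```

In a fuller pipeline, the prompt would be sent to one of the evaluated models and the generated answer compared against a reference answer with metrics such as those listed in the abstract.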

Item Type: Thesis (Masters)
Supervisors: Stynes, Paul (Email: UNSPECIFIED)
Uncontrolled Keywords: LLM; RAG; Data Embeddings; Vector Database
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
R Medicine > Healthcare Industry
Divisions: School of Computing > Master of Science in Artificial Intelligence
Depositing User: Ciara O'Brien
Date Deposited: 20 Jun 2025 08:27
Last Modified: 20 Jun 2025 08:27
URI: https://norma.ncirl.ie/id/eprint/7952
