Sharma, Dev (2024) Hyper-parameter Optimization for LSTM models. Masters thesis, Dublin, National College of Ireland.
Abstract
Research in Natural Language Processing (NLP) has grown rapidly because of its high potential across many applications. In particular, the use of Long Short-Term Memory (LSTM) networks for tasks such as text prediction, language modelling and machine translation has risen. Despite the effectiveness of LSTM networks, achieving good model performance remains challenging. Existing research does not detail the dependencies between the hyperparameters and the performance of the model, which leads to poor results. This research addresses that gap in order to increase real-world applicability. It aims to determine the impact of hyperparameter optimization on LSTM models for NLP applications; the application studied here is text prediction. The text data is extracted from a PDF document using the PyPDF2 library and tokenized with the Keras Tokenizer provided by TensorFlow. Before modelling, the data is prepared using several pre-processing techniques. An LSTM model is then built from a number of layers with different functions. Four hyperparameters are considered: embedding output dimension, number of LSTM units, dropout rate and learning rate. These are optimized with the Random Search approach provided by Keras Tuner, and the resulting performance is evaluated. The results show a marginal improvement over the baseline model. The study contributes a comprehensive framework for enhancing the performance of neural networks in text prediction applications.
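The random search described in the abstract can be illustrated with a minimal, library-free sketch. It is not the thesis code: the candidate values in `SEARCH_SPACE`, the function names, and the `placeholder_evaluate` scoring function are all assumptions made for illustration, since the abstract names the four hyperparameters but not their ranges. In the actual study the search is run by Keras Tuner, and the score would come from training and validating an LSTM model rather than from a toy formula.

```python
import random

# Hypothetical search space. The abstract names these four hyperparameters
# but gives no ranges, so every candidate value here is an assumption.
SEARCH_SPACE = {
    "embedding_dim": [32, 64, 128, 256],
    "lstm_units": [64, 128, 256],
    "dropout_rate": [0.1, 0.2, 0.3, 0.5],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}


def sample_config(rng):
    """Draw one configuration by picking each value independently at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def random_search(evaluate, max_trials=10, seed=0):
    """Evaluate max_trials random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        config = sample_config(rng)
        score = evaluate(config)  # higher is better
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def placeholder_evaluate(config):
    """Stand-in objective so the sketch runs without training a model.

    A real objective would build an LSTM from `config`, train it, and
    return a validation metric. This formula is arbitrary.
    """
    return (-abs(config["dropout_rate"] - 0.2)
            - abs(config["lstm_units"] - 128) / 1000)


best_config, best_score = random_search(placeholder_evaluate, max_trials=20, seed=42)
print(best_config, best_score)
```

Random search samples each hyperparameter independently instead of enumerating every combination, so the cost is set by `max_trials` rather than by the size of the grid (here 4 × 3 × 4 × 3 = 144 combinations).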
Item Type: | Thesis (Masters) |
---|---|
Supervisors: | Qayum, Abdul (email: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing; Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Ciara O'Brien |
Date Deposited: | 25 Aug 2025 11:07 |
Last Modified: | 25 Aug 2025 11:07 |
URI: | https://norma.ncirl.ie/id/eprint/8625 |