NORMA eResearch @NCI Library

Classification of Cancer Gene Variants Stages using Ensemble and Deep Learning Approaches

Rajasekar, Ramyaa (2023) Classification of Cancer Gene Variants Stages using Ensemble and Deep Learning Approaches. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science)
PDF (Configuration manual)


The advancement of technology and its integration with healthcare have had a positive impact on the world. Cancer has had a significant impact on society in recent years, but its threat can be reduced by using artificial intelligence to help medical professionals make an early diagnosis: classifying cancer stages from a patient's genetic history, based on clinical evidence, so that individualized treatment can be provided in a time-efficient manner. To automate the manual process by which clinical experts assign genetic mutations to a specific cancer class, this work uses the MSKCC gene data available on Kaggle to build traditional machine learning models (Logistic Regression, K-Nearest Neighbors (KNN), Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting (GB), and a Majority Voting Ensemble Classifier) and a deep learning model (Long Short-Term Memory, LSTM), and compares their performance. Natural Language Processing (NLP) techniques, Word2Vec word embeddings, and hyperparameter tuning were applied to improve classification performance. The models were evaluated using Accuracy, Recall, F1 score, and Log loss. On these metrics, the Voting Ensemble Classifier performed best, attaining an accuracy of 69% with a minimal log loss of 0.89.
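
For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how a voting ensemble's accuracy and multi-class log loss can be computed. It is not the thesis's implementation. It assumes soft voting (averaging each model's class probabilities), since log loss requires probabilities; the thesis may have combined its models differently. The three "models", their probabilities, and the labels are invented toy values, and the function names are hypothetical.

```python
import math

def soft_vote(prob_sets):
    """Average the per-class probabilities of several models, sample by sample."""
    n_models = len(prob_sets)
    averaged = []
    for per_model in zip(*prob_sets):  # one tuple of model outputs per sample
        n_classes = len(per_model[0])
        averaged.append(
            [sum(p[c] for p in per_model) / n_models for c in range(n_classes)]
        )
    return averaged

def predict(probs):
    """Pick the class index with the highest averaged probability."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for t, p in zip(y_true, probs):
        clipped = min(max(p[t], eps), 1 - eps)  # avoid log(0)
        total -= math.log(clipped)
    return total / len(y_true)

# Toy data (assumed for illustration): 3 models, 3 samples, 3 classes.
model_a = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.3, 0.3, 0.4]]
model_b = [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3], [0.2, 0.5, 0.3]]
model_c = [[0.6, 0.1, 0.3], [0.3, 0.3, 0.4], [0.1, 0.2, 0.7]]
y_true = [0, 1, 2]

ensemble = soft_vote([model_a, model_b, model_c])
y_pred = predict(ensemble)
print("accuracy:", accuracy(y_true, y_pred))
print("log loss:", round(log_loss(y_true, ensemble), 4))
```

Lower log loss means the ensemble assigns higher probability to the correct class, which is why it is reported alongside accuracy: two classifiers with the same accuracy can differ in how confident their correct predictions are.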

Item Type: Thesis (Masters)
Ul Ain, Qurrat
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
R Medicine > RC Internal medicine > RC0254 Neoplasms. Tumors. Oncology (including Cancer)
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Tamara Malone
Date Deposited: 24 May 2023 17:40
Last Modified: 24 May 2023 17:40
