NORMA eResearch @NCI Library

A Deep Learning Framework Integrating CNN, BiLSTM, and Attention for Multi-Label Text Classification in News

Cai, Wanpin (2025) A Deep Learning Framework Integrating CNN, BiLSTM, and Attention for Multi-Label Text Classification in News. Masters thesis, Dublin, National College of Ireland.


Abstract

In recent years, multi-label text classification (MLTC) has attracted widespread attention in areas such as news recommendation, legal decision prediction, and public opinion monitoring. Existing methods still have limitations in handling label imbalance and semantic dependencies. This study proposes a hybrid deep learning framework based on CNN-BiLSTM-Attention for multi-label classification of news text. The framework uses a CNN to extract local semantic features, a BiLSTM to model contextual dependencies, and an attention mechanism to highlight the semantic information most relevant to a particular label. We empirically evaluate the proposed model on the classic Reuters-21578 dataset and the newly constructed MN-DS dataset. Experimental results show that the proposed model performs well on high-frequency labels (such as earn), achieving an F1 score significantly superior to baseline methods. However, its performance on low-frequency labels and on the MN-DS dataset is limited (F1 score of only 0.35). Further analysis indicates that uneven class distribution and insufficient embedding representation are the primary reasons. Nevertheless, this study demonstrates the potential of the CNN–BiLSTM–Attention framework in capturing the multi-layered semantic features of news text and reveals some label dependencies through interpretable attention weights. Future work will focus on improving imbalance handling methods (such as resampling and focal loss) and on introducing label semantic enhancement and contrastive learning to improve the model’s generalization across datasets and in low-resource scenarios.
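
The abstract names focal loss as one candidate for handling label imbalance. The sketch below is not taken from the thesis; it is a minimal, standard-library illustration of the usual per-label (sigmoid) focal loss, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), averaged over labels. The function name `multilabel_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration, not values reported by the author.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multilabel_focal_loss(logits, targets, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean focal loss over the labels of one document.

    logits  -- raw scores, one per label (hypothetical model outputs)
    targets -- 0/1 indicators, one per label
    gamma   -- focusing parameter; 0 recovers alpha-weighted cross-entropy
    alpha   -- weight on positive labels (negatives get 1 - alpha)
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        # p_t is the probability the model assigns to the true outcome.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t) ** gamma shrinks the loss of well-classified labels.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
    return total / len(logits)


if __name__ == "__main__":
    # One document, three labels: a confident hit, a confident miss,
    # and a correctly rejected negative.
    print(multilabel_focal_loss([4.0, -4.0, -3.0], [1, 1, 0]))
```

With `gamma=0` the modulating factor disappears and the function reduces to alpha-weighted binary cross-entropy, which makes it easy to compare the two losses on the same rare labels.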

Item Type: Thesis (Masters)
Supervisors: Raj, Kislay (email: UNSPECIFIED)
Uncontrolled Keywords: Multi-label text classification (MLTC); CNN-BiLSTM-Attention; Deep learning
Subjects: Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence
P Language and Literature > P Philology. Linguistics > Computational linguistics. Natural language processing
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Artificial Intelligence
Depositing User: Ciara O'Brien
Date Deposited: 28 May 2026 13:14
Last Modified: 28 May 2026 13:14
URI: https://norma.ncirl.ie/id/eprint/9314
