Redmond, Stephen and Rozaki, Eleni (2017) Using bipartite graphs projected onto two dimensions for text classification. In: Fifth International Conference on Advances in Computing, Communication and Information Technology - CCIT 2017, 2-3 September 2017, Zurich, Switzerland.
Abstract
In our Big Data world, the amount of text being gathered is ever expanding. For many years, data curators have sought ways to group these documents and identify common topics. As the size of the problem increases, solutions that will scale are needed. The purpose of this work is to present a novel text classifier that can be used for text-mining and interactive information access. The model that is demonstrated can be used to extract hierarchical relations between topics, as well as to conduct unsupervised clustering of documents and keywords. The approach taken with this model is the use of graph-of-words key term extraction and a dimensional projection of the bipartite graph of documents and key terms. This projection makes it possible for terms to be co-clustered efficiently in relation to their documents, and for documents to be co-clustered in relation to their terms. Furthermore, the key term extraction process that is outlined can be scaled to a large corpus using a distributed processing system such as Apache Spark, and users can interact visually with the resultant model.
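
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the ranking or projection used, so both are stand-ins chosen for brevity: key terms are ranked by their degree in a sliding-window graph-of-words, and the bipartite document-term graph is laid out in two dimensions by placing documents on a unit circle and each term at the barycentre of its documents. All function names, the `window` and `top_k` parameters, and the example corpus are hypothetical.

```python
import math
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Link each term to the distinct terms that follow it within `window` tokens."""
    neighbours = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbours[term].add(other)
                neighbours[other].add(term)
    return neighbours


def key_terms(tokens, top_k=3, window=3):
    """Rank terms by graph degree (an assumed stand-in for the paper's extraction)."""
    graph = graph_of_words(tokens, window)
    ranked = sorted(graph, key=lambda t: (-len(graph[t]), t))
    return ranked[:top_k]


def bipartite_graph(corpus, top_k=3, window=3):
    """Map each document id to its set of key terms (the bipartite edges)."""
    return {
        doc_id: set(key_terms(text.lower().split(), top_k, window))
        for doc_id, text in corpus.items()
    }


def project_2d(doc_terms):
    """Place documents on a unit circle and terms at the mean of their documents.

    This barycentric layout is an assumption made for illustration only.
    """
    doc_ids = sorted(doc_terms)
    n = len(doc_ids)
    doc_pos = {
        d: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, d in enumerate(doc_ids)
    }
    term_docs = defaultdict(list)
    for d in doc_ids:
        for t in doc_terms[d]:
            term_docs[t].append(d)
    term_pos = {}
    for t, docs in term_docs.items():
        xs = [doc_pos[d][0] for d in docs]
        ys = [doc_pos[d][1] for d in docs]
        term_pos[t] = (sum(xs) / len(xs), sum(ys) / len(ys))
    return doc_pos, term_pos


if __name__ == "__main__":
    # Hypothetical toy corpus, used only to exercise the functions.
    corpus = {
        "doc1": "graph mining of text graph models",
        "doc2": "text mining with spark clusters",
        "doc3": "spark clusters scale graph processing",
    }
    doc_pos, term_pos = project_2d(bipartite_graph(corpus))
    for term, (x, y) in sorted(term_pos.items()):
        print(f"{term}: ({x:.2f}, {y:.2f})")
```

In this layout, terms attached to the same set of documents land on the same point, which gives a rough sense of how a projection lets terms and documents be grouped together. A production system would likely replace both the degree ranking and the circular layout with the methods described in the paper itself.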
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4150 Computer Network Resources |
Divisions: | School of Computing > Staff Research and Publications |
Depositing User: | Timothy Lawless |
Date Deposited: | 17 Aug 2018 11:07 |
Last Modified: | 17 Aug 2018 11:07 |
URI: | https://norma.ncirl.ie/id/eprint/3071 |