Addya, Sufal (2020) Extractive text summarization of image extracted text. Masters thesis, Dublin, National College of Ireland.
Abstract
Text summarization is a large field within text analytics. This research proposes an approach to summarizing text extracted from images. Optical character recognition using PyTesseract with OpenCV performs well at extracting text from images, and the research applies two unsupervised extractive summarization algorithms, TextRank and TF-IDF, to that text to produce a meaningful summary. The proposed pipeline produces promising output that could be used in future to build a text summarization application. Tesseract with OpenCV extracted the text well, and both extractive summarization algorithms produced meaningful summaries; however, evaluating the accuracy of the generated summaries remains a challenge that future work needs to address.
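As an illustration of the TF-IDF stage named in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes the text has already been extracted from an image (for example with a call such as `pytesseract.image_to_string`), treats each sentence as a document, and scores sentences by the sum of their term TF-IDF weights. The function names, the naive sentence splitter, and the `n_sentences` parameter are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stop-word removal in this sketch.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_summary(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Document frequency: number of sentences that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Sentence score = sum over terms of (term frequency * inverse document frequency).
        score = sum(
            (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(score)

    # Pick the top-scoring sentence indices, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `tfidf_summary(extracted_text, n_sentences=3)` on OCR output would return up to three sentences verbatim from the input, which is what makes the summary extractive rather than abstractive.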
Item Type: | Thesis (Masters) |
---|---|
Subjects: | Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science; Q Science > QA Mathematics > Computer software; T Technology > T Technology (General) > Information Technology > Computer software |
Divisions: | School of Computing > Master of Science in Data Analytics |
Depositing User: | Dan English |
Date Deposited: | 22 Jan 2021 10:20 |
Last Modified: | 22 Jan 2021 10:20 |
URI: | https://norma.ncirl.ie/id/eprint/4430 |