NORMA eResearch @NCI Library

Improvising Processing of Huge Real Time Data Combining Kafka and Spark Streaming

Lingalwar, Jeevantika (2019) Improvising Processing of Huge Real Time Data Combining Kafka and Spark Streaming. Masters thesis, Dublin, National College of Ireland.

Abstract

The cloud computing era has brought many innovations in computing technologies, including data processing, storage, network access, internet security, and web data. Knowledge discovery plays a key role in the invention and abstraction of new knowledge, and the World Wide Web (WWW) has become a hub for these inventions. As the volume of data has grown, the WWW has become overloaded, which has made extracting and processing data more complex. Data must be processed quickly so that a firm can respond proactively to changing business conditions. Stream processing is the continuous processing of information as it is created. It is well suited to information streams such as log data and sensor data. Several tools are available for real-time processing, including Spark, Kafka Streams, and Storm. In this paper, the proposed framework for processing huge volumes of real-time data is Apache Spark Streaming combined with Apache Kafka. The execution time is recorded after each experiment to give a clear view of the processing speed. Based on the experiments performed, it is concluded that Spark alone takes less time to process a huge amount of data, whereas the execution time of the proposed framework, i.e. Kafka combined with Spark, depends on the size of the dataset.
Keywords: Cloud Computing, MapReduce, Hadoop, Apache Kafka, Apache Spark, Java

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science

Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software

T Technology > T Technology (General) > Information Technology > Cloud computing
Divisions: School of Computing > Master of Science in Cloud Computing
Depositing User: Dan English
Date Deposited: 09 Jun 2020 09:12
Last Modified: 09 Jun 2020 09:12
URI: http://norma.ncirl.ie/id/eprint/4249
