NORMA eResearch @NCI Library

A Serverless Pipeline Framework with Dynamic Schema Adaptation for Enhanced CSV Processing in AWS

Chokalingam Pillai Subramonia Pillai, Sanjith (2025) A Serverless Pipeline Framework with Dynamic Schema Adaptation for Enhanced CSV Processing in AWS. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 718kB
PDF (Configuration Manual), 487kB

Abstract

The rise of cloud and serverless computing has changed how data is processed, bringing high scalability and cost-effectiveness to workloads that span many types of data. Nevertheless, traditional CSV processing workflows cope poorly with dynamic schema changes and lack quality assurance measures, especially in a serverless setup. This research presents an intelligent serverless framework for CSV processing that combines real-time schema translation with anomaly detection and quality assurance. Unlike the classical batch-processing approach, which relies on AWS Glue crawlers for schema detection, the framework uses AWS Lambda functions with dynamic Python-based schema inference, eliminating the schema synchronization bottleneck. AWS Step Functions orchestrates multi-stage validation workflows to ensure data integrity throughout the processing pipeline. In addition, the framework implements an intelligent storage optimization strategy that sends frequently accessed data to Amazon Redshift Serverless and cold data to S3 Intelligent-Tiering. The performance evaluation covers schema adaptation latency, anomaly detection accuracy, processing throughput, and cost per GB processed, compared against traditional implementations based on AWS Glue and Redshift.
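
The idea of Python-based schema inference mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names (infer_type, infer_schema, diff_schemas), the three type labels, and the sampling limit are all assumptions chosen for illustration, and the sketch uses only the Python standard library rather than any AWS API. In a Lambda function, logic of this kind would presumably run on the object body read from S3.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type label that fits every non-empty value."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "string"
    # Try the narrowest type first, then widen (hypothetical type labels).
    for name, caster in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                caster(v)
            return name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text, sample_rows=100):
    """Infer a {column: type} mapping from the first sample_rows data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for i, row in enumerate(reader):
        if i >= sample_rows:
            break
        for name in columns:
            # Short rows yield None for missing cells; treat them as empty.
            columns[name].append(row.get(name) or "")
    return {name: infer_type(vals) for name, vals in columns.items()}


def diff_schemas(old, new):
    """Report columns that were added, removed, or changed type."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "retyped": sorted(c for c in shared if old[c] != new[c]),
    }


# Two versions of the same feed; the second adds a column and breaks a type.
v1 = "id,price\n1,9.5\n2,10\n"
v2 = "id,price,region\n1,n/a,EU\n"
drift = diff_schemas(infer_schema(v1), infer_schema(v2))
```

Comparing the schema inferred from each incoming file against the last known schema is one simple way to detect drift per file instead of waiting for a scheduled crawler run; how the thesis actually implements this step is described in the full text, not in this record.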

Item Type: Thesis (Masters)
Supervisors: Samarawickarma, Yasantha (Email: UNSPECIFIED)
Subjects: T Technology > T Technology (General) > Information Technology > Cloud computing
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Cloud Computing
Depositing User: Ciara O'Brien
Date Deposited: 20 Mar 2026 14:30
Last Modified: 20 Mar 2026 14:30
URI: https://norma.ncirl.ie/id/eprint/9205
