NORMA eResearch @NCI Library

Serverless AI: Leveraging Cloud Functions For GPU-Optimized Machine Learning Deployment and Comparative Analysis with Traditional Methods

Chaudhry, Muhammad Qamar (2025) Serverless AI: Leveraging Cloud Functions For GPU-Optimized Machine Learning Deployment and Comparative Analysis with Traditional Methods. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 694kB
PDF (Configuration Manual), 731kB

Abstract

The fusion of serverless computing and GPU-accelerated machine learning (ML) marks a new approach to AI deployment in the cloud. Traditional strategies based on virtual machines or containers tend to struggle with scaling, cost efficiency, management overhead, and infrastructure. This work analyzes a hybrid serverless architecture based on AWS Lambda and API Gateway that manages inference workloads using GPU-backed EC2 instances. The research focuses on two machine learning models, a cyberbullying detection model built with ML algorithms and an object detection model (YOLOv8), and measures latency, resource consumption, processing time, and cost. The findings show that lightweight models perform consistently under serverless configurations, while heavy-workload models benefit from dynamic offload to GPU-backed EC2 instances. Delay-sensitive GPU models, however, face latency spikes and inconsistent resource utilization, and they have very high resource needs. Nevertheless, the hybrid model achieves a balance between performance and cost, albeit with diminishing returns. The study shows that such serverless architectures can serve modern AI workloads where lower responsiveness is acceptable, which makes these approaches viable when the requirements are suitable.

Item Type: Thesis (Masters)
Supervisors: Heeney, Sean (email: UNSPECIFIED)
Uncontrolled Keywords: Serverless Computing; Machine Learning Deployment; GPU Acceleration; AWS Lambda; Cloud Computing; Hybrid Architecture; Cost Optimization; Inference Performance; YOLOv8; Cyberbullying Detection
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence
B Philosophy. Psychology. Religion > Psychology > Aggressiveness > Bullying
T Technology > T Technology (General) > Information Technology > Cloud computing
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Machine learning
Divisions: School of Computing > Master of Science in Cloud Computing
Depositing User: Ciara O'Brien
Date Deposited: 21 Nov 2025 14:22
Last Modified: 21 Nov 2025 14:22
URI: https://norma.ncirl.ie/id/eprint/8951
