Islam, Md Raisul, Islam, Mohammad Monirul, Rakib, MD. Aminul Islam, Anik, Md. Shihab Mahmud, Kundu, Amit Kumar and Tuhin, Md. Golam Morshed (2026) Building a Balanced Deepfake Dataset: Aligned Faces for Robust Model Training and Evaluation. In: 2025 IEEE 4th International Conference on Robotics, Automation, Artificial-Intelligence and Internet-of-Things (RAAICON). IEEE, Dhaka, Bangladesh, pp. 486-491. ISBN 979-8-3315-9282-0
Full text not available from this repository.
Abstract
Deepfake media (highly realistic AI-generated face swaps) pose a growing threat to the authenticity of digital content. To support research on detecting such forgeries, we introduce a large-scale, well-annotated deepfake video dataset. It comprises 480 videos (110,694 frames), split equally between genuine and forged content. The deepfake videos were generated with two modern face-swap workflows (the open-source Roop Face-Swapper and the commercial Akool AI tool) applied to videos of 30 volunteers (15 male, 15 female). Frames were sampled at 5 fps, and a face detector (MTCNN) was used to crop and align the main face in each frame, yielding 3,746 real-face images and 106,948 fake-face images. The dataset is balanced by demographics and by generation method, so it can serve as a diverse reference for deepfake detection. Ethical clearance was obtained from Daffodil International University. The whole dataset, including images, labels, and metadata, is accessible to the research community. It differs from previous datasets such as FaceForensics++ and Celeb-DF in its demographic diversity, multi-pipeline generation, and transparent creation pipeline. To illustrate its effectiveness, a custom CNN classifier trained on the dataset achieved an accuracy of 97.8% in distinguishing real from fake faces. For transparency and reproducibility, we give full details of the recording, generation, and preprocessing pipeline.
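The 5 fps frame-sampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `sample_frame_indices`, the 30 fps native frame rate used in the example, and the floor-based index selection are all assumptions made for illustration. The MTCNN cropping and alignment step is not covered here.

```python
def sample_frame_indices(total_frames: int, native_fps: float,
                         target_fps: float = 5.0) -> list[int]:
    """Return the indices of frames to keep when sampling a video at target_fps.

    Hypothetical helper: the abstract only states that frames were sampled
    at 5 fps, not how the indices were chosen.
    """
    if total_frames < 0 or native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame count must be >= 0 and frame rates must be > 0")

    # Distance (in native frames) between two kept frames. If the video is
    # already slower than the target rate, keep every frame.
    step = max(native_fps / target_fps, 1.0)

    indices = []
    k = 0
    while True:
        idx = int(k * step)  # floor of the k-th sampling position
        if idx >= total_frames:
            break
        indices.append(idx)
        k += 1
    return indices


# Example (assumed values): a 2-second clip recorded at 30 fps.
kept = sample_frame_indices(total_frames=60, native_fps=30.0)
print(len(kept), kept[:3])
```

Computing each index as `int(k * step)` rather than accumulating a running sum avoids floating-point drift on long videos with non-integer frame-rate ratios such as 29.97 fps.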