Authors: Kotov, G. I., Nakov, O. N., Lazarova, M. K., Nakov, P. O., Georgiev, G. P.
Title: Face Swapping Attack Detection Using Multistream Feature Embeddings
Keywords: AI-generated faces, deepfake detection, face swapping, feature embeddings, multi-stream

Abstract: The paper presents a novel multi-stream feature embedding approach for detecting face swapping attacks and distinguishing them from fully AI-generated face images. The proposed architecture combines global features from Vision Transformers, local artifacts from convolutional neural networks, and frequency-domain analysis, allowing the model to capture complementary forensic cues that single-stream approaches often miss. The processing pipeline employs an image preprocessing and face alignment stage to standardize the input data as required by the three different streams of the suggested architecture. Attention-based modular fusion is applied to the feature embeddings extracted from each stream to generate a unified representation with dynamic weighting, improving both the robustness and the interpretability of the model, as it selectively relies on the most informative stream depending on the input's generative characteristics. Based on the suggested approach, a classification model is trained for three classes (real, face-swapped, and AI-generated images) using a diverse mix of images from both curated and real-world datasets. The experimental results demonstrate strong performance: the model effectively detects AI-generated faces through global and spectral anomalies, while identifying face swaps through localized inconsistencies.
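The attention-based fusion described above can be illustrated with a minimal, dependency-free sketch. The gating vectors, embedding dimension, and scoring rule below are hypothetical placeholders (the paper's exact fusion parameters are not given in the abstract); the sketch only shows the general mechanism of scoring each stream, normalizing the scores with a softmax, and taking a weighted sum of the per-stream embeddings.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_fusion(streams, gate_weights):
    """Fuse per-stream embeddings into one vector with dynamic weights.

    streams:      list of equal-length embedding vectors, one per stream
                  (e.g. ViT global, CNN local, frequency-domain)
    gate_weights: one gating vector per stream -- hypothetical learned
                  parameters standing in for the paper's fusion module
    """
    # Scalar relevance score per stream: dot(gate, embedding)
    scores = [sum(g * e for g, e in zip(gate, emb))
              for gate, emb in zip(gate_weights, streams)]
    alphas = softmax(scores)  # dynamic stream weights, sum to 1
    dim = len(streams[0])
    # Weighted sum of the stream embeddings -> unified representation
    fused = [sum(a * emb[i] for a, emb in zip(alphas, streams))
             for i in range(dim)]
    return fused, alphas

# Toy example: three 4-dimensional stream embeddings
vit  = [0.9, 0.1, 0.0, 0.2]   # global (Vision Transformer) features
cnn  = [0.2, 0.8, 0.1, 0.0]   # local artifact (CNN) features
freq = [0.0, 0.1, 0.7, 0.3]   # frequency-domain features
gates = [[1.0, 0.0, 0.0, 0.0]] * 3  # placeholder gating vectors
fused, alphas = attention_fusion([vit, cnn, freq], gates)
```

The weights `alphas` make the fusion interpretable: inspecting them shows which stream the model relied on for a given input, which is the property the abstract attributes to the dynamic weighting.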

References

  1. “Deepfake Statistics 2023,” Home Security Heroes, Available online: https://www.homesecurityheroes.com/state-of-deepfakes (accessed 18 June 2025).
  2. J. Boyle, “Study shows most of us can't tell difference between human face and one created by AI,” The Sunday Post, Available online: https://www.sundaypost.com/fp/aifaces (accessed 18 June 2025).
  3. S. Barrington, E. A. Cooper, and H. Farid, “People are poorly equipped to detect AI-powered voice clones,” Scientific Reports, vol. 15, no. 1, 2025.
  4. S. Ahmed, “Examining public perception and cognitive biases in the presumed influence of deepfakes threat: empirical evidence of third person perception from three studies,” Asian Journal of Communication, vol. 33, no. 3, 2023, pp. 308-331.
  5. “Sumsub Identity Fraud Report,” Sumsub, Available online: https://sumsub.com/fraud-report-2024 (accessed 18 June 2025).
  6. “What are deepfakes, and how can you spot them?,” Sumsub, Available online: https://sumsub.com/blog/what-are-deepfakes (accessed 18 June 2025).
  7. I. Goodfellow, et al., “Generative Adversarial Networks,” Advances in Neural Information Processing Systems (NIPS), vol. 27, 2014.
  8. T. Karras, S. Laine, and T. Aila, “A style-based generator architecture for generative adversarial networks,” Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 4401-4410.
  9. J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in Neural Information Processing Systems (NeurIPS), vol. 33, 2020, pp. 6840-6851.
  10. R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684-10695.
  11. C. Saharia, et al., “Photorealistic text-to-image diffusion models with deep language understanding,” Advances in Neural Information Processing Systems (NeurIPS), vol. 35, 2022, pp. 19572-19586.
  12. A. Ramesh, et al., “Zero-shot text-to-image generation,” Proc. of the 38th International Conference on Machine Learning, 2021, pp. 8821-8831.
  13. “SynthID,” Google DeepMind, Available online: https://deepmind.google/science/synthid (accessed 18 June 2025).
  14. K. Liu, et al., “DeepFaceLab: Integrated, flexible and extensible face-swapping framework,” Pattern Recognition, vol. 141, 2023.
  15. “Faceswap”, Available online: https://faceswap.dev, (accessed 18 June 2025).
  16. J. Thies, M. Zollhofer, M. Stamminger, C. Theobalt, and M. Nießner, “Face2Face: Real-time face capture and reenactment of RGB videos,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2387-2395.
  17. S. Banerjee, S. Yadav, A. Dhara, and M. Ajij, “A survey: deepfake and current technologies for solutions,” CEUR Workshop Proceedings, vol. 3900, 2024.
  18. A. Dosovitskiy, et al., “An image is worth 16x16 words: transformers for image recognition at scale,” International Conference on Learning Representations, 2021.
  19. A. Mehta, B. McArthur, N. Kolloju, and Z. Tu, “HFMF: Hierarchical fusion meets multi-stream models for deepfake detection,” Proc. of 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2025, pp. 676-685.
  20. K. Patil, S. Kale, J. Dhokey, A. Gulhane, “Deepfake detection using biological features: A survey,” arXiv preprint arXiv:2301.05819, 2023.
  21. X. Hu, “A comprehensive evaluation of deepfake detection methods: approaches, challenges and future prospects,” ITM Web of Conferences, EDP Sciences, vol. 73, 2025.
  22. J. Cheng, et al., “ED4: Explicit data-level debiasing for deepfake detection,” arXiv preprint arXiv:2408.06779, 2024.
  23. D. King, “Dlib-ml: A machine learning toolkit,” Journal of Machine Learning Research, vol. 10, 2009, pp. 1755-1758.
  24. “OpenCV,” Available online: https://opencv.org.
  25. F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1251-1258.
  26. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
  27. X. Luo and Y. Wang, “Frequency-domain masking and spatial interaction for generalizable deepfake detection,” Electronics, vol. 14, 2025.
  28. L. Lin, S. Santosh, M. Wu, X. Wang, S. Hu, “AI-face: A million-scale demographically annotated AI-generated face dataset and fairness benchmark,” Proc. of the Computer Vision and Pattern Recognition Conference, 2025, pp. 3503-3515.
  29. N. Chandra, et al., “Deepfake-eval-2024: A multi-modal in-the-wild benchmark of deepfakes circulated in 2024,” arXiv preprint arXiv:2503.02857, 2025.
  30. A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, “FaceForensics++: learning to detect manipulated facial images,” Proc. of the IEEE/CVF International Conference on Computer Vision, 2019.
  31. B. Cavia, E. Horwitz, T. Reiss, and Y. Hoshen, “Real-time deepfake detection in the real-world,” arXiv preprint arXiv:2406.09398, 2024.
  32. “AI vs Deepfake vs Real”, Available online: https://huggingface.co/datasets/prithivMLmods/AI-vs-Deepfake-vs-Real (accessed 18 June 2025).
  33. “ThisPersonDoesNotExist”, Available online: https://thispersondoesnotexist.com (accessed 18 June 2025).

Issue

2025 13th International Scientific Conference on Computer Science, COMSCI 2025 - Proceedings, 2025, Albania, https://doi.org/10.1109/COMSCI67172.2025.11225271

Type: publication in an international forum, publication in a refereed edition, indexed in Scopus