DFBench: The Image Deepfake Detection Benchmark 2025
DFBench provides a standardized evaluation for deepfake detection systems in computer vision. This leaderboard covers image deepfake detection, for example detecting the output of text-to-image and image-to-image models.
Objectives:
- Allow fair comparison between deepfake detection models on unseen test data (fine-tuning on the test data is not possible)
- Advance the state-of-the-art in synthetic media identification
Leaderboard: Image Deepfake Detection
Rank | Model | Accuracy (%) | Accuracy on Real (%) | Accuracy on Fake (%) | Accuracy on JPEG (%) | Accuracy on PNG (%) | Accuracy on WEBP (%) | Accuracy on TIFF (%) |
---|---|---|---|---|---|---|---|---|
1 | RECCE | 67.3 | 99.4 | 35.1 | 64.2 | 69.5 | 68.8 | 69.8 |
2 | Xception | 66.1 | 99.3 | 33.0 | 63.8 | 67.4 | 69.0 | 66.7 |
3 | ResNet101 | 65.5 | 97.7 | 33.4 | 63.1 | 67.2 | 66.7 | 67.6 |
4 | Xception SLADD | 65.0 | 99.9 | 30.1 | 62.5 | 65.6 | 67.2 | 67.4 |
5 | STIL | 64.7 | 98.3 | 31.2 | 61.4 | 67.4 | 67.7 | 65.8 |
6 | ResNet34 | 64.0 | 98.4 | 29.6 | 61.8 | 65.8 | 65.0 | 65.6 |
7 | VGG19 | 60.7 | 99.4 | 21.9 | 57.5 | 61.8 | 64.0 | 62.8 |
8 | EfficientNetB4 | 58.2 | 99.7 | 16.8 | 55.5 | 60.6 | 61.1 | 58.4 |
9 | CLIP | 55.4 | 94.0 | 16.8 | 54.6 | 57.4 | 56.6 | 54.0 |
10 | Xception FFD | 54.8 | 97.3 | 12.3 | 53.7 | 56.4 | 56.3 | 54.1 |
The leaderboard is updated once new submissions have been validated. All results are computed on the official test dataset.
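To make the table columns concrete, here is a minimal sketch of how an overall accuracy and the per-class and per-format breakdowns could be computed from a list of predictions. It is not the official DFBench evaluation code, which is not shown here. The function name `accuracy_breakdown`, the record layout, and the label convention (0 = real, 1 = fake) are assumptions made for illustration only.

```python
from collections import defaultdict


def accuracy_breakdown(records):
    """Return accuracies in percent, keyed by subset name.

    `records` is an iterable of (true_label, predicted_label, image_format)
    tuples. Labels are assumed to be 0 = real and 1 = fake (a hypothetical
    convention, not taken from DFBench).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for true_label, predicted_label, image_format in records:
        hit = int(true_label == predicted_label)
        class_key = "real" if true_label == 0 else "fake"
        # Each sample counts toward the overall score, its class, and its format.
        for key in ("overall", class_key, image_format):
            totals[key] += 1
            correct[key] += hit
    return {key: round(100.0 * correct[key] / totals[key], 1) for key in totals}


# Toy data: two real and two fake images, one fake image misclassified as real.
records = [
    (0, 0, "jpeg"),
    (0, 0, "png"),
    (1, 0, "jpeg"),
    (1, 1, "png"),
]
scores = accuracy_breakdown(records)
print(scores)
```

In the table, each overall accuracy lies within about 0.05 points of the mean of the real and fake accuracies (for RECCE, (99.4 + 35.1) / 2 = 67.25 against a reported 67.3). That would be consistent with a test set containing roughly equal numbers of real and fake images, although the class split is not stated here.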