Hugging Face
Datasets: 12
Sort: Trending
Active filters: official
cais/hle • Benchmark • Updated Jan 20 • 2.5k • 43.1k • 736
openai/gsm8k • Benchmark • Updated Dec 20, 2025 • 17.6k • 511k • 1.18k
TIGER-Lab/MMLU-Pro • Benchmark • Updated Jan 19 • 12.1k • 97.1k • 442
allenai/olmOCR-bench • Benchmark • Updated 15 days ago • 2.9k • 113
SWE-bench/SWE-bench_Verified • Benchmark • Updated 7 days ago • 500 • 98.7k • 16
Idavidrein/gpqa • Benchmark • Updated about 18 hours ago • 1.25k • 90.7k • 373
ScaleAI/SWE-bench_Pro • Benchmark • Updated 11 days ago • 731 • 62.6k • 54
mteb/arguana • Benchmark • Updated 12 days ago • 11.5k • 7.77k • 3
MathArena/aime_2026 • Benchmark • Updated 18 days ago • 30 • 3.91k • 20
FutureMa/EvasionBench • Benchmark • Updated 15 days ago • 16.7k • 577 • 117
harborframework/terminal-bench-2.0 • Benchmark • Updated 17 days ago • 1.52k • 5
MathArena/hmmt_feb_2026 • Benchmark • Updated 15 days ago • 33 • 366