Hugging Face
CKeibel's Collections
SLMs
PII
Code-Embeddings
Speech2Text (ASR)
Seq2Seq
Reward Models
diffusion models
Text-Classification
Data
PEFT (Papers)
LLMs (Papers)
Causal LMs, seq2seq models
Embedding models
Vision stuff
datasets
NER
BERT based tasks (models)
Multimodal

Data

updated Feb 13

  • HuggingFaceFW/fineweb-2

    Updated Oct 27, 2025 • 4.48B rows • 54.3k downloads • 810 likes

  • allenai/c4

    Updated Jan 9, 2024 • 10.4B rows • 762k downloads • 584 likes

  • ServiceNow-AI/R1-Distill-SFT

    Updated Feb 8, 2025 • 1.85M rows • 4.78k downloads • 320 likes

  • PrimeIntellect/INTELLECT-2-RL-Dataset

    Updated May 13, 2025 • 285k rows • 240 downloads • 66 likes

  • togethercomputer/RedPajama-Data-V2

    Updated Nov 21, 2024 • 8.86k downloads • 403 likes

  • wikimedia/wikipedia

    Updated Jan 9, 2024 • 61.6M rows • 241k downloads • 1.23k likes

  • avemio/German-RAG-EMBEDDING-TRIPLES-HESSIAN-AI

    Updated Oct 16, 2024 • 294k rows • 19 downloads • 1 like

  • urchade/synthetic-pii-ner-mistral-v1

    Updated Apr 20, 2024 • 319 downloads • 16 likes

  • yahma/alpaca-cleaned

    Updated Apr 10, 2023 • 51.8k rows • 29.8k downloads • 833 likes