Reproducing Evaluation with lighteval
Hey!
For reproducibility's sake, can you verify whether this is the right configuration for evaluating with lighteval?
helm|hellaswag|0|0
lighteval|arc:easy|0|0
leaderboard|arc:challenge|0|0
helm|mmlu|0|0
helm|piqa|0|0
helm|commonsenseqa|0|0
lighteval|triviaqa|0|0
leaderboard|winogrande|0|0
lighteval|openbookqa|0|0
leaderboard|gsm8k|5|0
Furthermore:
- Did you manually calculate the average over the accuracy for `easy` and `challenge` for ARC?
- What metrics did you report? Is it all accuracy?
Greetings,
Patrick
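To make the configuration above easier to check, here is a minimal sketch that parses each task string into its four pipe-separated fields. The field names (`suite`, `task`, `num_fewshot`, `truncate_fewshot`) and the `TaskSpec` helper are assumptions made for illustration, not lighteval's own API; the third field is read as the few-shot count and the fourth as a 0/1 flag.

```python
from dataclasses import dataclass

# Task strings copied verbatim from the post above.
TASKS = """\
helm|hellaswag|0|0
lighteval|arc:easy|0|0
leaderboard|arc:challenge|0|0
helm|mmlu|0|0
helm|piqa|0|0
helm|commonsenseqa|0|0
lighteval|triviaqa|0|0
leaderboard|winogrande|0|0
lighteval|openbookqa|0|0
leaderboard|gsm8k|5|0
"""


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical container for one task string (not a lighteval class)."""
    suite: str
    task: str
    num_fewshot: int
    truncate_fewshot: bool


def parse_task(line: str) -> TaskSpec:
    """Split 'suite|task|few_shot|flag' into typed fields."""
    parts = line.strip().split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got: {line!r}")
    suite, task, shots, flag = parts
    return TaskSpec(suite, task, int(shots), flag == "1")


specs = [parse_task(line) for line in TASKS.splitlines() if line.strip()]

for spec in specs:
    print(f"{spec.suite:<12} {spec.task:<15} few-shot={spec.num_fewshot}")
```

Read this way, every task is zero-shot except `gsm8k`, which uses 5 shots, and the tasks are drawn from three different suites (`helm`, `lighteval`, `leaderboard`).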
It also seems that some numbers might be wrong:
Winogrande 52.5 -> 54.62
GSM8K 3.2 -> 0.32
PIQA 71.3 -> 3.1 (em), 9.0 (qem), 3.8 (pem), 19.79 (pqem)
etc.
Some numbers seem to be wildly different from what you reported.
| Task |Version| Metric |Value | |Stderr|
|-----------------------------------------------|------:|--------|-----:|---|-----:|
|all | |em |0.1994|± |0.0285|
| | |qem |0.2052|± |0.0282|
| | |pem |0.2423|± |0.0308|
| | |pqem |0.4098|± |0.0352|
| | |acc |0.4650|± |0.0142|
| | |acc_norm|0.4796|± |0.0151|
|helm:commonsenseqa:0 | 0|em |0.1949|± |0.0113|
| | |qem |0.1974|± |0.0114|
| | |pem |0.1949|± |0.0113|
| | |pqem |0.3129|± |0.0133|
|helm:hellaswag:0 | 0|em |0.2173|± |0.0041|
| | |qem |0.2404|± |0.0043|
| | |pem |0.2297|± |0.0042|
| | |pqem |0.3162|± |0.0046|
|helm:mmlu:_average:0 | |em |0.2021|± |0.0297|
| | |qem |0.2109|± |0.0303|
| | |pem |0.2469|± |0.0321|
| | |pqem |0.4168|± |0.0366|
|(MMLU subtask rows omitted) | | | | | |
|helm:piqa:0 | 0|em |0.0311|± |0.0025|
| | |qem |0.0904|± |0.0041|
| | |pem |0.0386|± |0.0027|
| | |pqem |0.1979|± |0.0057|
|leaderboard:arc:challenge:0 | 0|acc |0.3660|± |0.0141|
| | |acc_norm|0.3848|± |0.0142|
|leaderboard:gsm8k:5 | 0|qem |0.0030|± |0.0015|
|leaderboard:winogrande:0 | 0|acc |0.5462|± |0.0140|
|lighteval:arc:easy:0 | 0|acc |0.7016|± |0.0094|
| | |acc_norm|0.6801|± |0.0096|
|lighteval:openbookqa:0 | 0|acc |0.2460|± |0.0193|
| | |acc_norm|0.3740|± |0.0217|
|lighteval:triviaqa:0 | 0|qem |0.1699|± |0.0028|
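For the ARC question and the mismatches listed earlier, here is a minimal sketch of the arithmetic, using only values from the table and the post. The unweighted mean of `easy` and `challenge` is an assumption; the thread does not confirm how the ARC average in the model card was computed.

```python
# Fractions copied from the results table above.
arc_easy_acc = 0.7016
arc_challenge_acc = 0.3660
arc_easy_acc_norm = 0.6801
arc_challenge_acc_norm = 0.3848

# Assumption: a plain unweighted mean of the two ARC subsets, in percent.
arc_avg_acc = 100 * (arc_easy_acc + arc_challenge_acc) / 2
arc_avg_acc_norm = 100 * (arc_easy_acc_norm + arc_challenge_acc_norm) / 2

# Reported numbers as quoted in the post (percent) vs. this run (percent).
reported = {"Winogrande": 52.5, "GSM8K": 3.2}
measured_pct = {"Winogrande": 100 * 0.5462, "GSM8K": 100 * 0.0030}

# Positive delta means this run scored higher than the reported number.
deltas = {name: measured_pct[name] - reported[name] for name in reported}

print(f"ARC average (acc):      {arc_avg_acc:.2f}")
print(f"ARC average (acc_norm): {arc_avg_acc_norm:.2f}")
for name, delta in deltas.items():
    print(f"{name}: measured {measured_pct[name]:.2f} vs reported "
          f"{reported[name]:.2f} (delta {delta:+.2f})")
```

Which of `acc` and `acc_norm` matches the reported figure depends on the metric the authors used, which is exactly what the second question above asks.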
Hi, you can find the evaluation details here: https://github.com/huggingface/smollm/blob/main/evaluation/README.md (MMLU cloze is currently missing; we'll add it soon)
Here's an updated link that gives the scores in the model card: https://github.com/huggingface/smollm/tree/main/text/evaluation#smollm2-base-models