ollama run jpacifico/chocolatine-2.1
889a0931d5ca · 2.5GB
Chocolatine-2.1 is an open-weight 4B dense instruct model optimized for French, delivering strong performance, natural dialogue, and robust multilingual usability for local and agent-oriented use.
Built through post-training with a strong focus on French, Chocolatine-2.1 is designed to deliver more natural and effective behavior in French while preserving strong multilingual usability. Despite the French-centered training pipeline, English performance remains robust and can even slightly improve over the base model, suggesting positive cross-lingual transfer.
A compact 4B instruct model derived from Qwen3-4B-Instruct-2507 and post-trained with DPO and model merging.
This is the most practical entry point in the Chocolatine-2 family for users who want a strong quality / speed / footprint trade-off, especially for local inference and lightweight agent loops.
Key characteristics:
- 4B dense instruct model.
- 262K native context.
- optimized for direct, efficient generation.
- strong French benchmark gains relative to the base model.
- robust English performance despite French-focused post-training.
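The 262K native context is not used automatically: Ollama's default context window (`num_ctx`) is much smaller, so long-context use requires raising it explicitly. A minimal Modelfile sketch, assuming enough RAM/VRAM for the larger KV cache (the variant name `chocolatine-long` is illustrative):

```
FROM jpacifico/chocolatine-2.1:q4_k_m

# Raise the context window toward the model's 262K native limit.
# Large values increase KV-cache memory use substantially.
PARAMETER num_ctx 262144
```

Build and run the long-context variant with `ollama create chocolatine-long -f Modelfile` followed by `ollama run chocolatine-long`.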
chocolatine-2.1:latest is the default recommended variant.
It currently maps to chocolatine-2.1:q4_k_m, which is the most accessible option across a wide range of hardware, including CPU-only setups.
Recommended default variant for most users.
ollama run jpacifico/chocolatine-2.1:q4_k_m
Higher-quality variant for users who can afford a larger memory footprint.
ollama run jpacifico/chocolatine-2.1:q8_0
Additional MLX variants are available in 4bit and 6bit on Hugging Face for Apple Silicon workflows. Ollama also supports MLX-based inference, making these variants relevant for local Apple Silicon use.
Chocolatine-2.1 places a strong emphasis on measured performance, not only stylistic adaptation.
The comparison is against its direct base model, Qwen3-4B-Instruct-2507.
Across the reported French benchmarks, Chocolatine-2.1 shows consistent improvements over the base model across knowledge, reasoning, grammar, and dialogue tasks, indicating a broad gain in French performance.
| Benchmark | Base model | Chocolatine-2.1 4B | Delta |
|---|---|---|---|
| gpqa-fr:diamond | 28.93 | 32.49 | +3.56 |
| french_bench_arc_challenge | 47.13 | 49.79 | +2.66 |
| french_bench_grammar | 70.59 | 72.27 | +1.68 |
| xwinograd_fr | 66.27 | 67.47 | +1.20 |
| french_bench_boolqa | 88.76 | 89.89 | +1.13 |
| french_bench_hellaswag | 56.99 | 58.03 | +1.04 |
| global_mmlu_fr | 63.75 | 64.75 | +1.00 |
| fr_mt_bench | 6.22 | 6.44 | +0.22 |
FR-MT-Bench evaluation is performed on MT-Bench-French, using multilingual-mt-bench with OpenAI/GPT-5 as the LLM judge.
global_mmlu_fr, xwinograd_fr and french_bench results were obtained using EleutherAI LM Eval Harness in a 0-shot evaluation setting.
gpqa-fr:diamond was evaluated with LightEval/vLLM, following the kurakurai/Luth evaluation process.
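The Delta column can be reproduced directly from the reported scores; a small sanity-check script, with benchmark names and values copied from the table above:

```python
# (base model, Chocolatine-2.1 4B) scores as reported in the table.
scores = {
    "gpqa-fr:diamond": (28.93, 32.49),
    "french_bench_arc_challenge": (47.13, 49.79),
    "french_bench_grammar": (70.59, 72.27),
    "xwinograd_fr": (66.27, 67.47),
    "french_bench_boolqa": (88.76, 89.89),
    "french_bench_hellaswag": (56.99, 58.03),
    "global_mmlu_fr": (63.75, 64.75),
    "fr_mt_bench": (6.22, 6.44),
}

# Recompute the delta column and confirm every benchmark improved.
deltas = {name: round(new - base, 2) for name, (base, new) in scores.items()}
for name, delta in deltas.items():
    print(f"{name}: +{delta}")

assert all(d > 0 for d in deltas.values())
```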
Chocolatine-2.1 is well suited for:
- French general assistants.
- local chat and private inference.
- RAG and document-grounded assistants.
- automation and agent-oriented workflows.
- Apple Silicon and local-first experimentation.
- compact deployments where latency and cost matter.
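For the RAG and document-grounded use cases, the core step is assembling retrieved passages and the user question into a single grounded prompt. A minimal sketch; the prompt template and helper name are illustrative, not part of the model:

```python
def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Concatenate retrieved passages into a numbered context block, then append the question."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Réponds uniquement à partir des extraits suivants. "
        "Si la réponse n'y figure pas, dis-le.\n\n"
        f"Extraits :\n{context}\n\nQuestion : {question}\nRéponse :"
    )


prompt = build_grounded_prompt(
    "Quelle est la capitale de la France ?",
    ["Paris est la capitale de la France.", "La France est en Europe."],
)
print(prompt)
```

The resulting prompt can then be sent to the model, for example through `ollama run jpacifico/chocolatine-2.1` or the local HTTP API.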
Chocolatine-2.1 does not include a built-in moderation layer.
As with any open-weight instruct model, output quality depends on prompting, evaluation setup, context quality, and deployment safeguards. Additional care is recommended for sensitive or high-risk applications.
Apache-2.0
Developed by: Jonathan Pacifico, 2026.
Made with ❤️ in France