Notes
Uploaded in fp16 (half‑precision) and Q8_0 formats. Q8_0 is the default – it offers a good size/speed trade‑off for CPU inference while preserving nearly all of the fp16 quality.
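A minimal sketch of running the Q8_0 file on CPU, assuming it is a GGUF and using llama-cpp-python; the file name, context size, and thread count below are placeholders, not part of the upload:

```python
from llama_cpp import Llama

# Placeholder path -- substitute the actual Q8_0 GGUF file you downloaded.
llm = Llama(
    model_path="VibeThinker-1.5B-Q8_0.gguf",
    n_ctx=4096,    # context window; adjust to your memory budget
    n_threads=8,   # CPU threads
)

out = llm(
    "Solve: what is the sum of the first 100 positive integers?",
    max_tokens=512,
    temperature=0.6,  # recommended starting temperature (see below)
)
print(out["choices"][0]["text"])
```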
Temperature: The model was tuned for math reasoning. Start with 0.6 (or 1.0 for more exploratory code generation). Lower values (≈0.2) suit very short fact‑lookup prompts.
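For the fp16 weights, a hedged example of applying these sampling settings with Hugging Face transformers; the repo id and the top_p value are assumptions, not taken from this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WeiboAI/VibeThinker-1.5B"  # placeholder; use the actual repo id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [{"role": "user", "content": "Find all integer solutions of x^2 - 5x + 6 = 0."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(
    inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,  # math reasoning default; try 1.0 for exploratory code generation
    top_p=0.95,       # assumption: a common top-p value, not stated in this card
)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```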
Description
VibeThinker‑1.5B – a 1.5B‑parameter dense model built on top of Qwen2.5‑Math‑1.5B.
- Core innovation: Spectrum‑to‑Signal Principle (SSP) – a two‑stage training pipeline that first maximizes solution diversity during SFT, then reinforces correct signals with RL. This diversity‑first approach lets a tiny model punch far above its parameter count.
- Specialty: Competition‑style math (AIME, HMMT) and algorithmic coding (LeetCode, Codeforces). The model performs best when questions are asked in English.
- Architecture: Standard decoder‑only transformer stack (dense) with a focus on efficient attention; no MoE or exotic layers.
- Use‑case focus:
- Hard math problem solving
- Algorithmic code generation / reasoning
- Structured JSON output for tool‑calling, if needed (see the sketch after this list)
Not recommended for general‑purpose chat, summarization, or creative writing.
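An illustrative sketch of the structured‑JSON use case above; the prompt wording and schema are assumptions rather than an official tool‑calling interface, and `llm` reuses the llama-cpp-python instance from the Notes section:

```python
import json

messages = [
    {
        "role": "system",
        "content": (
            "You are a function-calling assistant. Reply with a single JSON "
            'object of the form {"tool": <name>, "arguments": <object>} and nothing else.'
        ),
    },
    {"role": "user", "content": "What is 17 * 23? Use the calculator tool."},
]

resp = llm.create_chat_completion(
    messages=messages,
    temperature=0.2,  # low temperature for short, deterministic structured output
    max_tokens=128,
)

raw = resp["choices"][0]["message"]["content"]
call = json.loads(raw)  # raises json.JSONDecodeError if the model drifts from pure JSON
print(call["tool"], call["arguments"])
```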
References
VibeThinker GitHub
ModelScope page
Technical Report (arXiv 2511.06221)
Model Card on HuggingFace