s1 is a reasoning model fine-tuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview and exhibits test-time scaling via budget forcing.
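Budget forcing controls how long the model thinks at inference time. If the model tries to stop thinking early, a continuation cue such as "Wait" is appended so it keeps reasoning. Once the thinking budget is used up, the end-of-thinking delimiter is appended so the model moves on to its answer. The sketch below illustrates that loop under stated assumptions and is not the s1 authors' implementation: `generate` is a hypothetical callable standing in for any text-generation backend, `END_OF_THINKING` is a placeholder delimiter (the real one depends on the model's chat template), and tokens are approximated by whitespace-separated words.

```python
from typing import Callable

# Placeholder delimiter (assumption): the actual end-of-thinking marker
# depends on the model's chat template.
END_OF_THINKING = "</think>"


def budget_force(
    generate: Callable[[str, int], str],
    prompt: str,
    max_thinking_tokens: int,
    num_extensions: int = 1,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> str:
    """Build a thinking trace with a forced minimum and maximum length.

    `generate(text, max_new_tokens)` is a hypothetical backend call that
    returns a continuation of `text` and is assumed to respect
    `max_new_tokens`. `count_tokens` defaults to a crude word count.
    """
    trace = ""
    extensions_left = num_extensions
    while True:
        remaining = max_thinking_tokens - count_tokens(trace)
        if remaining <= 0:
            break  # budget used up: stop thinking
        chunk = generate(prompt + trace, remaining)
        stopped = END_OF_THINKING in chunk
        # Keep only the text before the delimiter, if the model emitted one.
        chunk = chunk.split(END_OF_THINKING)[0]
        trace += chunk
        if stopped and extensions_left > 0:
            # The model tried to stop early: suppress the delimiter and
            # nudge it to keep reasoning.
            trace += " Wait,"
            extensions_left -= 1
            continue
        if stopped or not chunk.strip():
            break  # model finished, or produced nothing new
    # Close the thinking section so the answer can be generated next.
    return trace + END_OF_THINKING
```

In this sketch, raising `num_extensions` makes the model think longer, while lowering `max_thinking_tokens` cuts thinking short; the final answer would be produced by one more generation call on `prompt` plus the returned trace.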
tools
32b
227 Pulls · Updated 4 weeks ago
66b9ea09bd5b · 68B
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.