s1 is a reasoning model fine-tuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview and exhibits test-time scaling via budget forcing.
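Budget forcing, as mentioned in the description, controls how long the model "thinks" at inference time: if the model tries to stop reasoning early, a continuation cue such as "Wait" is appended so it keeps going, and once a token budget is used up the reasoning is cut off. The following is only a rough sketch of that idea, not the model's actual decoding code. The `generate` callback, the `</think>` delimiter, the list-of-tokens representation, and the `eager_model` stand-in are all assumptions made for illustration.

```python
from typing import Callable, List

END_OF_THINKING = "</think>"  # assumed delimiter; the real one depends on the chat template
CONTINUE_CUE = "Wait"         # cue appended when the model tries to stop early

# Hypothetical signature: generate(prompt, trace_so_far, max_new_tokens) -> new tokens.
# The returned list may contain END_OF_THINKING when the model decides to stop.
Generate = Callable[[str, List[str], int], List[str]]


def budget_forcing(generate: Generate, prompt: str,
                   max_thinking_tokens: int, num_waits: int = 1) -> List[str]:
    """Return a thinking trace whose length is steered by budget forcing."""
    trace: List[str] = []
    waits_left = num_waits
    while len(trace) < max_thinking_tokens:
        remaining = max_thinking_tokens - len(trace)
        new_tokens = generate(prompt, trace, remaining)[:remaining]
        stopped = END_OF_THINKING in new_tokens
        if stopped:
            # Drop the delimiter and anything after it.
            new_tokens = new_tokens[:new_tokens.index(END_OF_THINKING)]
        trace.extend(new_tokens)
        if stopped:
            if waits_left > 0 and len(trace) < max_thinking_tokens:
                # Lengthen: suppress the stop and nudge the model to continue.
                trace.append(CONTINUE_CUE)
                waits_left -= 1
            else:
                break
        elif not new_tokens:
            break  # nothing generated; avoid looping forever
    # Shorten: thinking always ends here, whether the model stopped or the budget ran out.
    trace.append(END_OF_THINKING)
    return trace


def eager_model(prompt: str, trace: List[str], max_new_tokens: int) -> List[str]:
    """Toy stand-in for a model that tries to stop after a single step."""
    return ["step", END_OF_THINKING]


forced_trace = budget_forcing(eager_model, "2+2?", max_thinking_tokens=10, num_waits=2)
```

With the toy model above, each premature stop is replaced by "Wait" until the two allowed extensions are used, so the trace alternates "step" and "Wait" before closing. A real integration would work on tokenizer IDs and pass the trace back to the model as context.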
tools
32b
227 Pulls • Updated 4 weeks ago
9 Tags
759a1a4e8735 • 20GB • 4 weeks ago
759a1a4e8735 • 20GB • 4 weeks ago
2d7e32e4f86f • 66GB • 4 weeks ago
cd94e711013f • 12GB • 4 weeks ago
2e931764577d • 16GB • 4 weeks ago
759a1a4e8735 • 20GB • 4 weeks ago
a75e926fb690 • 23GB • 4 weeks ago
dbeaf3c09cfd • 27GB • 4 weeks ago
fedf3b0567df • 35GB • 4 weeks ago