s1 is a reasoning model fine-tuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches the performance of o1-preview and exhibits test-time scaling via budget forcing.
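
Budget forcing controls how long the model reasons at decode time. If the model tries to stop thinking too early, the end-of-thinking delimiter is suppressed and "Wait" is appended so it keeps reasoning. If it runs past a maximum budget, the delimiter is appended to force an answer. The following is a minimal sketch of that loop, not the authors' implementation. The names `budget_force`, `next_token`, `toy_next_token`, and the `max_waits` parameter are hypothetical, the `</think>` delimiter string is an assumption, and a toy token source stands in for a real model.

```python
from typing import Callable, List

END_THINK = "</think>"  # assumed end-of-thinking delimiter (illustrative only)
WAIT = "Wait"           # string appended to extend the reasoning


def budget_force(
    next_token: Callable[[List[str]], str],
    min_think: int,
    max_think: int,
    max_waits: int = 2,
) -> List[str]:
    """Return a thinking trace whose length is steered by a token budget.

    next_token is a hypothetical callable that maps the thinking trace
    so far to the next token; a real model call would go in its place.
    """
    thinking: List[str] = []
    waits = 0
    while len(thinking) < max_think:
        tok = next_token(thinking)
        if tok == END_THINK:
            # The model wants to stop. If it is under the minimum budget,
            # drop the delimiter and append "Wait" to extend the reasoning.
            if len(thinking) < min_think and waits < max_waits:
                thinking.append(WAIT)
                waits += 1
                continue
            break
        thinking.append(tok)
    # Reaching max_think exits the loop; the delimiter is appended either way,
    # which forces the end of thinking when the budget is exhausted.
    return thinking + [END_THINK]


def toy_next_token(thinking: List[str]) -> str:
    """Toy stand-in for a model: tries to stop after 3 tokens per segment."""
    since_wait = 0
    for t in reversed(thinking):
        if t == WAIT:
            break
        since_wait += 1
    return END_THINK if since_wait >= 3 else f"t{len(thinking)}"


if __name__ == "__main__":
    print(budget_force(toy_next_token, min_think=5, max_think=20))
    print(budget_force(toy_next_token, min_think=0, max_think=2))
```

Raising `min_think` spends more test-time compute on reasoning, while lowering `max_think` caps it. The `max_waits` limit is an added safeguard in this sketch so the loop cannot extend indefinitely.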

tools • 32b

227 Pulls • Updated 4 weeks ago

9 Tags
759a1a4e8735 • 20GB • 4 weeks ago
759a1a4e8735 • 20GB • 4 weeks ago
2d7e32e4f86f • 66GB • 4 weeks ago
cd94e711013f • 12GB • 4 weeks ago
2e931764577d • 16GB • 4 weeks ago
759a1a4e8735 • 20GB • 4 weeks ago
a75e926fb690 • 23GB • 4 weeks ago
dbeaf3c09cfd • 27GB • 4 weeks ago
fedf3b0567df • 35GB • 4 weeks ago