s1 is a reasoning model fine-tuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview and exhibits test-time scaling via budget forcing.
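
Budget forcing controls how long the model "thinks" at inference time. The sketch below is a rough illustration of the idea, not the model's actual implementation. It assumes a hypothetical `generate(text, max_new_tokens)` callable that returns the model's continuation and stops when the model tries to end its reasoning, and it counts tokens by whitespace as a stand-in for a real tokenizer. If the reasoning ends too early, the string "Wait," is appended to push the model to continue; if it runs past the budget, it is cut off.

```python
from typing import Callable

# Hypothetical signature: (text so far, max new tokens) -> continuation.
Generate = Callable[[str, int], str]


def count_tokens(text: str) -> int:
    # Whitespace split is only a stand-in for a real tokenizer.
    return len(text.split())


def budget_force(generate: Generate, prompt: str, min_tokens: int,
                 max_tokens: int, max_extensions: int = 2) -> str:
    """Return a reasoning trace kept between min_tokens and max_tokens."""
    thinking = ""
    extensions = 0
    while True:
        remaining = max_tokens - count_tokens(thinking)
        if remaining <= 0:
            break  # budget used up: force the end of thinking
        thinking += generate(prompt + thinking, remaining)
        if count_tokens(thinking) >= min_tokens or extensions >= max_extensions:
            break  # long enough, or we have nudged the model enough times
        # The model stopped too early: suppress the stop and nudge it onward.
        thinking += " Wait,"
        extensions += 1
    # Hard cap in case the last append went over the budget.
    return " ".join(thinking.split()[:max_tokens])


def toy_generate(text: str, max_new_tokens: int) -> str:
    # Toy stand-in for a model: always emits one token, then "stops".
    return " step"


if __name__ == "__main__":
    print(budget_force(toy_generate, "Q: 2+2?", min_tokens=5, max_tokens=10))
```

Raising `min_tokens` spends more compute per question, which is the knob behind test-time scaling; `max_tokens` bounds the cost.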

tools

525 pulls • Updated 2 months ago

3 Tags
f63fdbd772dd • 20GB • 2 months ago
f03012c03b00 • 10GB • 2 months ago
f63fdbd772dd • 20GB • 2 months ago