544 7 months ago

s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview & exhibits test-time scaling via budget forcing.

tools

7 months ago

f63fdbd772dd · 20GB

qwen2
·
32.8B
·
Q4_K_M
{{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|> {{- else if .M

Readme