
-
Sky-T1-32B-Preview
A 32B reasoning model fine-tuned from Qwen2.5-32B-Instruct on 17K training examples, with performance on par with o1-preview.
tools 1,549 Pulls 2 Tags Updated 7 months ago
-
llama-3.2-3b-thinking
Llama 3.2 3B fine-tuned on Claude- and o1-style thinking
266 Pulls 2 Tags Updated 11 months ago
-
llama-3.2-rltp-v2
Llama 3.2 3B trained with Reinforcement Learning with Thought Process (RLTP) to achieve search
26 Pulls 1 Tag Updated 10 months ago
-
llama-3.2-rltp
Llama 3.2 3B trained with Reinforcement Learning with Thought Process (RLTP) to achieve search
20 Pulls 1 Tag Updated 10 months ago