Controlling How Long A Reasoning Model Thinks With Reinforcement Learning