Length Control for Reasoning Language Models with just a Prompt!
L1-Qwen-1.5B-Max constrains its output to be no longer than the target length given in the prompt, allowing flexibility in response length while respecting the upper bound. The model is converted from l3lab/L1-Qwen-1.5B-Max.
```
ollama pull devkit/L1-Qwen-1.5B-Max
```

Example prompt:

```
User: A is B’s father, C is D’s mother, and D and A are brothers. What is the relationship between B and C? Think for maximum 1024 tokens.
```
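The same prompt pattern can be used programmatically. Below is a minimal sketch that sends the example question to a local Ollama server over its REST API; it assumes the server is running on the default port (11434) and that the model has already been pulled. The prompt text and 1024-token budget are taken from the example above and are purely illustrative.

```python
import requests

# Assumes a local Ollama server on the default port and that the model
# has been pulled with: ollama pull devkit/L1-Qwen-1.5B-Max
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "devkit/L1-Qwen-1.5B-Max"

# The length budget is stated directly in the prompt; L1-Max treats it
# as an upper bound on how long the model reasons.
prompt = (
    "A is B's father, C is D's mother, and D and A are brothers. "
    "What is the relationship between B and C? Think for maximum 1024 tokens."
)

response = requests.post(
    OLLAMA_URL,
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    },
    timeout=600,
)
response.raise_for_status()
print(response.json()["message"]["content"])
```

To use a different token budget, change the number in the "Think for maximum N tokens." instruction; the model will keep its reasoning at or below that limit.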
For more info about this model, refer to this blog.
```bibtex
@misc{aggarwal2025l1controllinglongreasoning,
      title={L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning},
      author={Pranjal Aggarwal and Sean Welleck},
      year={2025},
      eprint={2503.04697},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.04697},
}
```