
DASD-4B-Thinking is a compact yet capable 4B-parameter dense language model specialized in long chain-of-thought (Long-CoT) reasoning for mathematics, code generation, and science. Warning: this version is overfitted; avoid using it.

4b
c8472cd9daed · 31B
You are a helpful AI assistant.