cogito:32b-v1-preview-qwen-q8_0

642.1K 6 months ago

Cogito v1 Preview is a family of hybrid reasoning models by Deep Cogito that outperform the best available open models of the same size, including counterparts from LLaMA, DeepSeek, and Qwen, across most standard benchmarks.

tools 3b 8b 14b 32b 70b


39317c19a975 · 35GB · qwen2 · 32.8B · Q8_0
Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US
{{- if .Messages }} {{- if or .System .Tools }}<|im_start|>system {{- if .System }} {{ .System }} {{
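
The listed size (35GB) is roughly what the parameter count (32.8B) and the Q8_0 quantization predict. The sketch below is a back-of-the-envelope check, not a statement of how the file is actually laid out. It assumes Q8_0 uses the GGML block layout (32 int8 weights plus one 16-bit scale per block) and that every tensor is stored in that format; the function name is hypothetical.

```python
# Assumption: Q8_0 follows the GGML block layout, where each block of
# 32 weights stores 32 int8 values plus one 16-bit scale (34 bytes total).
BLOCK_WEIGHTS = 32
BLOCK_BYTES = 32 + 2


def estimate_q8_0_size_gb(num_params: float) -> float:
    """Approximate size in decimal gigabytes.

    Ignores file metadata and any tensors kept at a different precision,
    so the real file can differ slightly from this estimate.
    """
    num_blocks = num_params / BLOCK_WEIGHTS
    return num_blocks * BLOCK_BYTES / 1e9


# 32.8B parameters, as listed in the model metadata.
print(f"{estimate_q8_0_size_gb(32.8e9):.2f} GB")
```

Under these assumptions the estimate is about 8.5 bits per weight, which lands close to the 35GB shown for this tag.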

Readme

The Cogito v1 Preview LLMs are instruction-tuned generative models (text in/text out). All models are released under an open license for commercial use.

  • Cogito models are hybrid reasoning models. Each model can answer directly (standard LLM), or self-reflect before answering (like reasoning models).
  • The LLMs are trained using Iterated Distillation and Amplification (IDA), a scalable and efficient alignment strategy for superintelligence using iterative self-improvement.
  • The models have been optimized for coding, STEM, instruction following and general helpfulness, and have significantly higher multilingual, coding and tool calling capabilities than size-equivalent counterparts.
    • In both standard and reasoning modes, Cogito v1-preview models outperform their size-equivalent counterparts on common industry benchmarks.
  • Each model is trained in over 30 languages and supports a context length of 128k.
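
The tool calling capability mentioned above can be exercised through the chat API. The following is a minimal sketch rather than an official example: it assumes the local Ollama `/api/chat` endpoint accepts a `tools` list of function schemas and returns any calls under `message.tool_calls`. The `get_temperature` tool, the registry, and both helper functions are hypothetical names introduced for illustration.

```python
import json


def get_temperature(city: str) -> str:
    """Hypothetical local tool; returns canned data instead of a real lookup."""
    return f"22 C in {city}"


# Function schema advertised to the model (assumed request format).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_temperature",
        "description": "Get the current temperature for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_temperature": get_temperature}


def build_tool_request(prompt: str, model: str = "cogito") -> dict:
    """Build a non-streaming chat request body that advertises the tools."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
        "stream": False,
    }


def run_tool_calls(message: dict) -> list:
    """Execute each requested tool and return 'tool' role messages."""
    results = []
    for call in message.get("tool_calls", []):
        function = call["function"]
        arguments = function["arguments"]
        if isinstance(arguments, str):  # tolerate JSON-encoded arguments
            arguments = json.loads(arguments)
        handler = REGISTRY[function["name"]]
        results.append({"role": "tool", "content": handler(**arguments)})
    return results
```

In a full loop, the messages returned by `run_tool_calls` would be appended to the conversation and sent back to the model so it can produce a final answer.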

Extended thinking

To enable extended thinking, include the string "Enable deep thinking subroutine." in the system prompt:

/set system """Enable deep thinking subroutine."""

Or via the API:

curl http://localhost:11434/api/chat -d '{
  "model": "cogito",
  "messages": [
    {
      "role": "system",
      "content": "Enable deep thinking subroutine."
    },
    {
      "role": "user",
      "content": "How many letter Rs are in the word Strawberry?"
    }
  ]
}'
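
The same request can be sent from Python. This is a sketch under a few assumptions: an Ollama server is listening on `localhost:11434` as in the curl example, the `stream` field is set to `false` so the reply arrives as a single JSON object, and the reply text sits under `message.content`. The helper names `build_chat_payload` and `chat` are hypothetical.

```python
import json
import urllib.request

DEEP_THINKING_PROMPT = "Enable deep thinking subroutine."


def build_chat_payload(question: str, deep_thinking: bool = False,
                       model: str = "cogito") -> dict:
    """Build the request body; the system prompt toggles extended thinking."""
    messages = []
    if deep_thinking:
        messages.append({"role": "system", "content": DEEP_THINKING_PROMPT})
    messages.append({"role": "user", "content": question})
    return {"model": model, "messages": messages, "stream": False}


def chat(payload: dict, url: str = "http://localhost:11434/api/chat") -> str:
    """POST the payload to a local server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


# Example (requires a running server with the model pulled):
# payload = build_chat_payload(
#     "How many letter Rs are in the word Strawberry?", deep_thinking=True)
# print(chat(payload))
```

Leaving `deep_thinking` at its default omits the system message, so the model answers directly in standard mode.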

Sizes

3B

ollama run cogito:3b

8B

ollama run cogito:8b

14B

ollama run cogito:14b

32B

ollama run cogito:32b

70B

ollama run cogito:70b
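
When scripting against several sizes, it can help to validate the size before building the command. The helper below is a small illustrative sketch; the function names are hypothetical, and the size list simply mirrors the tags shown above.

```python
# Sizes listed on this page; anything else is rejected early.
SIZES = ("3b", "8b", "14b", "32b", "70b")


def model_tag(size: str) -> str:
    """Return the tag for a listed size, e.g. '32B' -> 'cogito:32b'."""
    normalized = size.strip().lower()
    if normalized not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    return f"cogito:{normalized}"


def run_command(size: str) -> list:
    """Build the argument list for 'ollama run', suitable for subprocess.run."""
    return ["ollama", "run", model_tag(size)]
```

For example, `run_command("32B")` yields the same command as the 32B entry above, expressed as an argument list.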

Benchmarks

Smaller models - 3B and 8B

3B performance

3b.webp

8B performance

8b.webp

3B tool calling

3b-toolcalling.webp

Medium models - 14B and 32B

14B

14b.webp

32B

32b.webp

Larger models - 70B

70b.webp
