Our first full preview version, getting us out of the research preview state. Brings the lowest hallucination rates, SOTA performance across different evals and an efficient architecture.

aquif-3.0 Cosmos 🌠

aquif-3.0 Cosmos represents a significant leap forward in LLM capabilities, building upon the advancements of the aquif-3.0 line. This model boasts increased scale and performance, pushing the boundaries of what’s possible in reasoning and accuracy.

Model Overview

Name: aquif-3.0-cosmos
Parameters: 8.2 Billion
Architecture: Decoder-only transformer
Type: General-purpose LLM
Key Features:
- Enhanced reasoning through a new iteration of “thinking mode”.
- Strong performance across diverse benchmarks.
- Intended for a wide range of applications.

Features

8.2 Billion Parameters: Increased model size for enhanced capacity and performance.
Thinking Mode: Like Preview 2, Cosmos includes an enhanced “thinking mode” for improved reasoning.
Strong Performance: Demonstrates excellent results in evaluations, particularly in complex reasoning tasks.

Performance Benchmarks

aquif-3.0 Cosmos shows very promising results, indicating a substantial improvement in capabilities:

	aquif-3.0-preview-1	aquif-3.0-preview-2	aquif-3.0-cosmos	aquif-3.0-cosmos (thinking)
HumanEval	69.4	80.5	89.7	92.1
MATH-500	35.5	58.0	69.0	73.3
Hallucination Rate	16.5	14.2	7.6	8.2

Disclaimer: These results are based on internal evaluations and are approximate. Final, official benchmarks will be released with the model’s full release.

Using Thinking Mode

To activate the enhanced reasoning capabilities of “thinking mode,” send the following control message before your prompt:

{
  "role": "control",
  "content": "thinking"
}