27 Downloads Updated 4 months ago
aquif-3.0 Cosmos represents a significant leap forward in LLM capabilities, building upon the advancements of the aquif-3.0 line. This model boasts increased scale and performance, pushing the boundaries of what’s possible in reasoning and accuracy.
aquif-3.0 Cosmos shows very promising results, indicating a substantial improvement in capabilities:
aquif-3.0-preview-1 | aquif-3.0-preview-2 | aquif-3.0-cosmos | aquif-3.0-cosmos (thinking) | |
---|---|---|---|---|
HumanEval | 69.4 | 80.5 | 89.7 | 92.1 |
MATH-500 | 35.5 | 58.0 | 69.0 | 73.3 |
Hallucination Rate | 16.5 | 14.2 | 7.6 | 8.2 |
Disclaimer: These results are based on internal evaluations and are approximate. Final, official benchmarks will be released with the model’s full release.
To activate the enhanced reasoning capabilities of “thinking mode,” send the following control message before your prompt:
{
"role": "control",
"content": "thinking"
}