27 4 months ago

Our first full preview version, getting us out of the research preview state. Brings the lowest hallucination rates, SOTA performance across different evals and an efficient architecture.

tools

Models

View all →

Readme

aquif-3.0 Cosmos 🌠

aquif-3.0 Cosmos represents a significant leap forward in LLM capabilities, building upon the advancements of the aquif-3.0 line. This model boasts increased scale and performance, pushing the boundaries of what’s possible in reasoning and accuracy.

Model Overview

  • Name: aquif-3.0-cosmos
  • Parameters: 8.2 Billion
  • Architecture: Decoder-only transformer
  • Type: General-purpose LLM
  • Key Features:
    • Enhanced reasoning through a new iteration of “thinking mode”.
    • Strong performance across diverse benchmarks.
    • Intended for a wide range of applications.

Features

  • 8.2 Billion Parameters: Increased model size for enhanced capacity and performance.
  • Thinking Mode: Like Preview 2, Cosmos includes an enhanced “thinking mode” for improved reasoning.
  • Strong Performance: Demonstrates excellent results in evaluations, particularly in complex reasoning tasks.

Performance Benchmarks

aquif-3.0 Cosmos shows very promising results, indicating a substantial improvement in capabilities:

aquif-3.0-preview-1 aquif-3.0-preview-2 aquif-3.0-cosmos aquif-3.0-cosmos (thinking)
HumanEval 69.4 80.5 89.7 92.1
MATH-500 35.5 58.0 69.0 73.3
Hallucination Rate 16.5 14.2 7.6 8.2

Disclaimer: These results are based on internal evaluations and are approximate. Final, official benchmarks will be released with the model’s full release.

Using Thinking Mode

To activate the enhanced reasoning capabilities of “thinking mode,” send the following control message before your prompt:

{
  "role": "control",
  "content": "thinking"
}