ServiceNow-AI/Apriel-1.6-15b-Thinker

3,384 Downloads Updated 7 months ago

Capabilities: vision, tools, thinking

ollama run ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M

curl http://localhost:11434/api/chat \
  -d '{
    "model": "ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
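Because the model accepts images as well as text, the chat request above can be extended with an image attachment. The following is a minimal sketch using only the Python standard library and the `images` field of Ollama's `/api/chat` endpoint, which takes base64-encoded image data. The helper names `build_image_chat_payload` and `send_chat` are hypothetical, and the default host assumes a local Ollama server on the port used in the curl example.

```python
import base64
import json
import urllib.request

MODEL = "ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M"


def build_image_chat_payload(prompt: str, image_bytes: bytes, model: str = MODEL) -> dict:
    """Build an /api/chat request body with one base64-encoded image attached."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt, "images": [encoded]}],
        # Ask for a single JSON object instead of a stream of chunks.
        "stream": False,
    }


def send_chat(payload: dict, host: str = "http://localhost:11434") -> str:
    """POST the payload to an Ollama server (assumed local) and return the reply text."""
    request = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
    return body["message"]["content"]
```

With a server running, something like `send_chat(build_image_chat_payload("What is in this image?", open("photo.png", "rb").read()))` would return the model's answer, where `photo.png` stands in for any local image file.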

Applications

Claude Code

ollama launch claude --model ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M

OpenCode

ollama launch opencode --model ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M

Hermes Agent

ollama launch hermes --model ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M

OpenClaw

ollama launch openclaw --model ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M

Models

| Name | Size | Context | Input | Updated |
|---|---|---|---|---|
| Apriel-1.6-15b-Thinker:Q4_K_M | 9.7GB | 256K | Text, Image | 7 months ago |
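As a rough sanity check on the single-GPU claim in the Readme, the 9.7GB file size and the 15B parameter count can be turned into an average bits-per-weight figure and a naive fit test. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes, treats the whole file as weights, and uses a hypothetical `overhead_gb` placeholder for the KV cache and runtime buffers, which in practice grow with the context length in use.

```python
def bits_per_weight(file_size_gb: float, n_params: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 1e9 bytes."""
    return file_size_gb * 1e9 * 8 / n_params


def fits_in_vram(file_size_gb: float, vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """Naive check: weights plus an assumed fixed overhead must fit in VRAM."""
    return file_size_gb + overhead_gb <= vram_gb


# Figures taken from the model listing: 9.7GB file, 15B parameters.
average_bits = bits_per_weight(9.7, 15e9)
print(f"{average_bits:.2f} bits per weight on average")
print("fits in 16 GB:", fits_in_vram(9.7, 16))
print("fits in 8 GB:", fits_in_vram(9.7, 8))
```

The default overhead is deliberately a guess; a long prompt that approaches the 256K context window would need considerably more headroom than a short chat.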

Readme

**Summary**

Apriel-1.6-15B-Thinker is an updated multimodal reasoning model in ServiceNow's Apriel SLM series, building on [Apriel-1.5-15B-Thinker](https://huggingface.co/ServiceNow-AI/Apriel-1.5-15b-Thinker). With significantly improved text and image reasoning capabilities, Apriel-1.6 achieves competitive performance against models up to 10x its size. Like its predecessor, it benefits from extensive continual pre-training across both text and image domains. We additionally perform post-training focused on Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). Apriel-1.6 obtains frontier performance without sacrificing reasoning token efficiency: it improves or maintains task performance compared with Apriel-1.5-15B-Thinker while reducing reasoning token usage by more than 30%.

**Highlights**

- Achieves a score of 57 on the Artificial Analysis index, outperforming models such as Gemini 2.5 Flash, Claude Haiku 4.5, and GPT OSS 20b. Its score is on par with Qwen3 235B A22B, while the model is significantly more efficient.
- Reduces reasoning token usage by more than 30%, delivering significantly better efficiency than Apriel-1.5-15B-Thinker.
- Scores 69 on Tau2 Bench Telecom and 69 on IFBench, which are key benchmarks for the enterprise domain.
- At 15B parameters, the model fits on a single GPU, making it highly memory-efficient.
- Based on community feedback on Apriel-1.5-15B-Thinker, we simplified the chat template by removing redundant tags and introduced four special tokens to the tokenizer (`<tool_calls>`, `</tool_calls>`, `[BEGIN FINAL RESPONSE]`, `<|end|>`) for easier output parsing.
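The special tokens in the last highlight make raw output straightforward to split. The sketch below shows one way this could be done, under stated assumptions: reasoning text precedes `[BEGIN FINAL RESPONSE]`, the answer runs until `<|end|>`, and the text between `<tool_calls>` and `</tool_calls>` is JSON. The exact layout is not specified here, so `split_output` is a hypothetical helper, and a serving layer such as Ollama may already separate these parts before the text reaches the caller.

```python
import json
import re

FINAL_MARKER = "[BEGIN FINAL RESPONSE]"
END_TOKEN = "<|end|>"
TOOL_CALLS_RE = re.compile(r"<tool_calls>(.*?)</tool_calls>", re.DOTALL)


def split_output(raw: str) -> dict:
    """Split raw model text into reasoning, final response, and tool calls."""
    reasoning, marker, final = raw.partition(FINAL_MARKER)
    if not marker:
        # No marker found: treat the whole text as the final response.
        reasoning, final = "", raw
    # Drop the end token and anything after it.
    final = final.split(END_TOKEN, 1)[0]

    tool_calls = []
    for match in TOOL_CALLS_RE.finditer(raw):
        try:
            parsed = json.loads(match.group(1))  # assumption: payload is JSON
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        tool_calls.extend(parsed if isinstance(parsed, list) else [parsed])

    # Remove tool-call blocks from the user-facing answer.
    final = TOOL_CALLS_RE.sub("", final)
    return {
        "reasoning": reasoning.strip(),
        "final": final.strip(),
        "tool_calls": tool_calls,
    }
```

Returning a plain dictionary keeps the sketch small; malformed tool-call payloads are skipped silently here, which a real integration would probably want to log instead.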

Please see our [blog post](https://huggingface.co/blog/ServiceNow-AI/apriel-1p6-15b-thinker) for more details.
