robit/qwen3.5-9b-r5-research:q4km

62 Downloads Updated 3 months ago

Fine-tuned Qwen3.5-9B with distilled reasoning from research-backed datasets. R5 was the first round to use production-quality data sources (Bespoke-Stratos, Tulu-3, SlimOrca) and scored 84.2% on the diverse eval, ahead of the base model's 79.0%.

tools thinking

```bash
ollama run robit/qwen3.5-9b-r5-research:q4km
```

```bash
curl http://localhost:11434/api/chat \
  -d '{
    "model": "robit/qwen3.5-9b-r5-research:q4km",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

```python
from ollama import chat

response = chat(
    model='robit/qwen3.5-9b-r5-research:q4km',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)
```

```javascript
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'robit/qwen3.5-9b-r5-research:q4km',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
```

Details

1df19b2a88ed · 5.6GB

model (5.6GB): arch qwen35 · parameters 8.95B · quantization Q4_K_M

params (82B): { "num_ctx": 131072, "stop": [ "<|im_end|>" ], "temperature": 0.6, "top_

Readme

![r5_research_nutrition_label.png](/assets/robit/qwen3.5-9b-r5-research/e2b055d5-5ede-42e2-b80d-909ed3692927)

---

# Qwen3.5-9B R5 Research (Q4_K_M)

Fine-tuned Qwen3.5-9B with distilled reasoning from research-backed datasets. R5 was the first round to use production-quality data sources (Bespoke-Stratos, Tulu-3, SlimOrca) and scored 84.2% on the diverse eval, ahead of the base model's 79.0%. It has since been superseded by R7 (86.8%).

## Capabilities

- **Thinking** — produces structured reasoning in `<think>` blocks
- **Tool calling** — structured `tool_calls` via Ollama `/api/chat`
- **Instruction following** — concise answers, format constraints, system prompt adherence
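
The thinking and tool-calling behaviour listed above can be exercised with plain request and response dictionaries. The following is a minimal sketch, not the author's own client code: `get_weather` is a hypothetical tool invented for illustration, and the `<think>` splitter assumes the reasoning arrives inline in the message content. Depending on the Ollama version and the configured parser, the reasoning may instead be returned in a separate field, in which case the splitter simply returns the content unchanged.

```python
import json
import re

MODEL = "robit/qwen3.5-9b-r5-research:q4km"

# Hypothetical tool definition for illustration; not part of the model card.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def build_chat_request(prompt, tools=None):
    """Build a JSON-serializable body for POST /api/chat."""
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON object instead of a stream
    }
    if tools:
        body["tools"] = tools
    return body


def split_thinking(text):
    """Return (reasoning, answer) from text that may contain <think> blocks."""
    reasoning = "\n".join(part.strip() for part in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


def extract_tool_calls(message):
    """Return (name, arguments) pairs from a response message dict."""
    return [
        (call["function"]["name"], call["function"]["arguments"])
        for call in message.get("tool_calls", [])
    ]


if __name__ == "__main__":
    # Print the body that would be sent to http://localhost:11434/api/chat.
    print(json.dumps(build_chat_request("Weather in Paris?", [WEATHER_TOOL]), indent=2))
```

The helpers do not touch the network, so they can be checked without a running Ollama server. The printed body can be posted with `curl -d @-` or any HTTP client.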

## Eval Results

| Benchmark | Score |
|-----------|-------|
| Diverse stochastic eval (38 tests) | **84.2%** |
| Base qwen3.5:9b on same eval | 79.0% |
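
To put the table in concrete terms, the sketch below converts the two scores into a percentage-point gain and an approximate number of passed tests out of 38. It assumes each score is a simple pass fraction, which the README does not state. The eval is described as stochastic, so the figures may be averages over several runs: 30 of 38 is about 78.9%, not exactly the reported 79.0%.

```python
TOTAL_TESTS = 38  # from the table: "Diverse stochastic eval (38 tests)"
scores = {"R5": 84.2, "base": 79.0}  # reported percentages

# Gain in percentage points, rounded to avoid floating-point noise.
gain_points = round(scores["R5"] - scores["base"], 1)

# Nearest whole number of passed tests, assuming score = passes / total.
approx_passes = {
    name: round(pct / 100 * TOTAL_TESTS) for name, pct in scores.items()
}

if __name__ == "__main__":
    print(f"gain: {gain_points} percentage points")
    for name, passes in approx_passes.items():
        print(f"{name}: about {passes} of {TOTAL_TESTS} tests")
```

On a 38-test suite each test is worth roughly 2.6 percentage points, so the gap between R5 and the base model corresponds to about two tests.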

## Training

- **Base model**: [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B)
- **Method**: LoRA SFT (r=32, alpha=64, LR=1e-4, 1 epoch)
- **Data**: 4122 samples from:
  - [bespokelabs/Bespoke-Stratos-17k](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k) — DeepSeek-R1 reasoning traces
  - [allenai/tulu-3-sft-mixture](https://huggingface.co/datasets/allenai/tulu-3-sft-mixture) — instruction diversity
  - [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca) — curated GPT-4 instructions
- **Training suite**: [robit-man/fine_tuning_suite](https://github.com/robit-man/fine_tuning_suite)
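
The LoRA hyperparameters above imply a few derived quantities. The sketch below is illustrative only and does not reproduce the training suite's configuration: the class name `LoraHyperparams` is made up here, the scaling rule is the standard LoRA formulation (`alpha / r`), and the 4096 x 4096 layer shape is a hypothetical example, not a dimension taken from Qwen3.5-9B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Values from the README: r=32, alpha=64, LR=1e-4, 1 epoch."""

    r: int = 32
    alpha: int = 64
    learning_rate: float = 1e-4
    epochs: int = 1

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)


cfg = LoraHyperparams()

# Hypothetical layer shape, used only to show the order of magnitude.
example_params = cfg.adapter_params(4096, 4096)

if __name__ == "__main__":
    print(f"scaling factor: {cfg.scaling}")
    print(f"adapter parameters for a 4096 x 4096 layer: {example_params}")
```

With `alpha` set to twice `r`, the adapter update is scaled by 2. A rank-32 adapter on a square 4096-wide layer would add 262,144 trainable parameters, against roughly 16.8 million in the frozen weight matrix.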

## Quickstart

```bash
ollama run robit/qwen3.5-9b-r5-research:q4km
```

## Parameters

- `RENDERER qwen3.5` + `PARSER qwen3.5`
- `temperature 0.6`, `top_p 0.95`
- `stop "<|im_end|>"`
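
These defaults are baked into the model, but sampling settings can also be overridden per request through the `options` field of `/api/chat`. The sketch below builds such a request body. The helper name `chat_body` is an assumption for illustration, and the override values in the usage note are arbitrary.

```python
import json

MODEL = "robit/qwen3.5-9b-r5-research:q4km"

# Defaults listed in the README's Parameters section.
DEFAULT_OPTIONS = {
    "temperature": 0.6,
    "top_p": 0.95,
    "stop": ["<|im_end|>"],
}


def chat_body(prompt, **overrides):
    """Build a /api/chat body, merging per-request option overrides."""
    options = {**DEFAULT_OPTIONS, **overrides}
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "options": options,
        "stream": False,
    }


if __name__ == "__main__":
    # Lower temperature and a smaller context window for a short task.
    print(json.dumps(chat_body("Hello!", temperature=0.2, num_ctx=8192), indent=2))
```

The published params set `num_ctx` to 131072. Passing a smaller `num_ctx` as shown is one way to reduce memory use when long context is not needed. With the `ollama` Python package installed, a body like this can likely be sent as `ollama.chat(**chat_body("Hello!"))`, since `chat` accepts `model`, `messages`, `options`, and `stream` keyword arguments.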

## Note

R5 is superseded by [robit/qwen3.5-9b-r7-research:q4km](https://ollama.com/robit/qwen3.5-9b-r7-research:q4km), which adds PrimeIntellect data and scores 86.8%.

## License

Derived from Qwen3.5-9B (Apache 2.0). Training data licenses vary by source.
