robit/qwen3.5-9b-r7-research:q4km

127 Downloads · Updated 3 months ago

Fine-tuned Qwen3.5-9B with distilled reasoning from research-backed datasets. Trained via LoRA SFT with an additive data strategy that preserves base model capabilities while improving instruction following and reasoning.

tools thinking

```bash
ollama run robit/qwen3.5-9b-r7-research:q4km
```

```bash
curl http://localhost:11434/api/chat \
  -d '{
    "model": "robit/qwen3.5-9b-r7-research:q4km",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

```python
from ollama import chat

response = chat(
    model='robit/qwen3.5-9b-r7-research:q4km',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)
```

```javascript
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'robit/qwen3.5-9b-r7-research:q4km',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
```

Details

e861d95d6639 · 5.6GB

model: arch qwen35 · parameters 8.95B · quantization Q4_K_M (5.6GB)

params: { "stop": [ "<|im_end|>" ], "temperature": 0.6, "top_p": 0.95 } (65B)

Readme


![r7_research_nutrition_label.png](/assets/robit/qwen3.5-9b-r7-research/3a2e472f-60e6-4df3-83bb-dd6c258727ae)

---

# Qwen3.5-9B R7 Research (Q4_K_M)

Fine-tuned Qwen3.5-9B with distilled reasoning from research-backed datasets. Trained via LoRA SFT with an additive data strategy that preserves base model capabilities while improving instruction following and reasoning.

## Capabilities

- **Thinking** — produces structured reasoning in `<think>` blocks
- **Tool calling** — structured `tool_calls` via Ollama `/api/chat` with `tools` parameter
- **Instruction following** — concise answers, format constraints, system prompt adherence
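
As a rough illustration of the tool-calling path, the sketch below builds an `/api/chat` request with a `tools` array and reads `tool_calls` back from the reply. The `get_current_weather` tool and its schema are hypothetical, invented for this example. The endpoint and model tag are the ones used elsewhere on this page, and `"stream": false` is set so the reply arrives as a single JSON object.

```python
import json
import urllib.request

MODEL = "robit/qwen3.5-9b-r7-research:q4km"

# Hypothetical tool definition, used only for illustration.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_chat_request(prompt: str) -> dict:
    """Build an /api/chat payload that offers one tool to the model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
        "stream": False,
    }


def extract_tool_calls(response: dict) -> list:
    """Return (name, arguments) pairs from a non-streaming chat response."""
    calls = response.get("message", {}).get("tool_calls") or []
    return [(c["function"]["name"], c["function"]["arguments"]) for c in calls]


def chat(payload: dict, host: str = "http://localhost:11434") -> dict:
    """POST the payload to a local Ollama server and decode the JSON reply."""
    request = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


def main() -> None:
    # Requires a running Ollama server with the model pulled.
    reply = chat(build_chat_request("What is the weather in Paris?"))
    for name, arguments in extract_tool_calls(reply):
        print(name, arguments)
```

With a local server running, calling `main()` prints any tool calls the model requests. Whether it requests one at all depends on the prompt.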

## Eval Results

| Benchmark | Score |
|-----------|-------|
| Diverse stochastic eval (38 tests, 9 categories) | **86.8%** |
| Base qwen3.5:9b on same eval | 79.0% |
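
For a back-of-envelope reading of these scores, the sketch below converts each percentage to the nearest whole number of passing tests. It assumes each of the 38 tests is scored pass/fail and weighted equally, which the source does not state; the helper name is illustrative.

```python
def nearest_pass_count(score_pct: float, n_tests: int) -> int:
    """Closest whole number of passing tests for a reported percentage."""
    return round(score_pct / 100 * n_tests)


N_TESTS = 38  # from the table: 38 tests across 9 categories

for label, pct in [("fine-tuned", 86.8), ("base", 79.0)]:
    k = nearest_pass_count(pct, N_TESTS)
    # Assumption: equal-weight pass/fail scoring (not confirmed by the source).
    print(f"{label}: ~{k}/{N_TESTS} = {100 * k / N_TESTS:.1f}%")

print(f"gain: {86.8 - 79.0:.1f} percentage points")
```

Under that assumption, 86.8% corresponds to 33 of 38 tests. The base score of 79.0% does not land exactly on a whole count (30 of 38 is 78.9%), which may mean scores are averaged over several sampled runs or allow partial credit. The source does not say.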

## Training

- **Base model**: [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B)
- **Method**: LoRA SFT (r=32, alpha=64, LR=1e-4, 1 epoch)
- **Data**: Additive mix of 4043 samples from:
  - [bespokelabs/Bespoke-Stratos-17k](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k) — DeepSeek-R1 reasoning traces
  - [allenai/tulu-3-sft-mixture](https://huggingface.co/datasets/allenai/tulu-3-sft-mixture) — instruction diversity
  - [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca) — curated GPT-4 instructions
  - [PrimeIntellect/SYNTHETIC-1-SFT-Data](https://huggingface.co/datasets/PrimeIntellect/SYNTHETIC-1-SFT-Data) — verified math/code/STEM
- **Training suite**: [robit-man/fine_tuning_suite](https://github.com/robit-man/fine_tuning_suite)
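
To make the LoRA hyperparameters above concrete, here is a minimal sketch that records them and derives the update scaling `alpha / r` used by standard LoRA. The class and field names are illustrative and not taken from the training suite, and the 4096 x 4096 matrix is a hypothetical size rather than the model's real dimensions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSftConfig:
    """Hyperparameters quoted in the Training list; field names are illustrative."""

    rank: int = 32
    alpha: int = 64
    learning_rate: float = 1e-4
    epochs: int = 1

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / r.
        return self.alpha / self.rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)


cfg = LoraSftConfig()
print(cfg.scaling)
# Hypothetical 4096 x 4096 projection, NOT the model's actual dimensions.
print(lora_params(4096, 4096, cfg.rank))
```

With r=32 and alpha=64 the scaling factor is 2. In Hugging Face PEFT these two values would typically be passed as `LoraConfig(r=32, lora_alpha=64, ...)`, though the training suite's own configuration may differ.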

## Quickstart

```bash
ollama run robit/qwen3.5-9b-r7-research:q4km
```

## Parameters

- `RENDERER qwen3.5` + `PARSER qwen3.5` (enables tool calling)
- `temperature 0.6`, `top_p 0.95`
- `stop "<|im_end|>"`
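
These defaults ship with the model, so a plain request already uses them. If a different setting is needed for a single call, the Ollama chat API accepts an `options` object in the request body. The sketch below shows one way to build such a payload; `build_request` and `DEFAULT_OPTIONS` are names made up for this example.

```python
import json

MODEL = "robit/qwen3.5-9b-r7-research:q4km"

# Defaults listed in the Parameters section.
DEFAULT_OPTIONS = {"temperature": 0.6, "top_p": 0.95, "stop": ["<|im_end|>"]}


def build_request(prompt: str, **overrides) -> dict:
    """Return an /api/chat payload; keyword overrides replace defaults per request."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        # Merge into a new dict so DEFAULT_OPTIONS itself is never mutated.
        "options": {**DEFAULT_OPTIONS, **overrides},
        "stream": False,
    }


# Example: lower the temperature for one request, keep the other defaults.
payload = build_request("Summarize LoRA in one sentence.", temperature=0.2)
print(json.dumps(payload, indent=2))
```

The resulting JSON can be sent to `http://localhost:11434/api/chat` in the same way as the curl example at the top of the page.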

## License

Derived from Qwen3.5-9B (Apache 2.0). Training data licenses vary by source.
