
A 0.6B-parameter model built in two stages: knowledge distillation from a 30B Thinking teacher to establish a structured reasoning backbone, then supervised fine-tuning on legal instruction data. 50x parameter compression (30B → 0.6B). Under 500 MB quantized. Runs on a phone.
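As a rough sketch of what the stage-1 objective could look like, a common distillation setup blends a soft-target KL term against the teacher's logits with hard-label cross-entropy. The temperature and mixing weight below are illustrative assumptions, not details from the post.

```python
# Minimal distillation-loss sketch in PyTorch. Hyperparameters
# (temperature, alpha) are assumed, not taken from the project.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend soft-target KL distillation with hard-label cross-entropy."""
    # Soften both distributions; KL pulls the student toward the teacher.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean")
    kd = kd * temperature ** 2  # standard gradient-scale correction

    # Hard-label term keeps the student anchored to the ground truth.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,  # conventional padding/masking value
    )
    return alpha * kd + (1.0 - alpha) * ce
```

Stage 2 (supervised fine-tuning on legal instructions) would then drop the teacher term and train on the cross-entropy loss alone.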
