
Qwen3-1.7B, SFT-trained on Magpie-Pro-300K-Filtered, with the LoRA adapter merged into the base weights. Various quants available.

ollama run treyrowell1826/qwen3-pinion:q4_k_m



qwen3-pinion

Ollama runtime distribution of qwen3-pinion, a merged Qwen3 1.7B SFT checkpoint.

This package is the runtime downstream of the canonical full-weights and GGUF artifact releases and is intended for local inference through Ollama.

Canonical full weights: https://huggingface.co/Somnus-Sovereign-Systems/qwen3-pinion
Full-weights DOI: 10.57967/hf/7965 (https://doi.org/10.57967/hf/7965)

Canonical GGUF artifacts: https://huggingface.co/Somnus-Sovereign-Systems/qwen3-pinion-gguf
GGUF DOI: 10.57967/hf/7966 (https://doi.org/10.57967/hf/7966)

> [!WARNING]
> SFT-only release with reduced safety alignment versus strongly post-aligned systems. Treat outputs as untrusted and do not use this model for harmful, high-risk, or safety-critical workflows.

Runtime Lineage

This Ollama package is a downstream runtime distribution of the following artifact chain:

  • Canonical model artifact: full weights (Somnus-Sovereign-Systems/qwen3-pinion)
  • Canonical GGUF source artifact: qwen3-pinion-f16.gguf
  • Quantized GGUF derivatives: Q8_0, Q5_K_M, Q4_K_M
  • Ollama runtime tags: packaged from those GGUF artifacts for direct local inference

Published Tags

  • treyrowell1826/qwen3-pinion:f16
  • treyrowell1826/qwen3-pinion:q8_0
  • treyrowell1826/qwen3-pinion:q5_k_m
  • treyrowell1826/qwen3-pinion:q4_k_m

Quick use:

```bash
ollama run treyrowell1826/qwen3-pinion:q5_k_m
```

Artifact Hierarchy

f16 is the canonical GGUF source artifact used for this runtime distribution

q8_0, q5_k_m, and q4_k_m are downstream quantized runtime variants of that f16 source

Context and Template

This model uses ChatML-style turns and retains the exported chat template lineage from the base artifact chain.

Stop sequences:

  • <|im_end|>
  • <|im_start|>
  • <|endoftext|>
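
Concretely, a rendered ChatML exchange under this template looks like the following. This is illustrative only (the authoritative template ships inside the Ollama package); generation stops when the model emits one of the stop sequences above:

```
<|im_start|>system
You are Pinion.<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
```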

This model family supports a 40,960-token context window.
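
Published runtime tags may default to a smaller `num_ctx` than the model family supports. If you need a longer window (and have the memory for it), a custom Modelfile can raise it; a minimal sketch, assuming the q4_k_m tag:

```
FROM treyrowell1826/qwen3-pinion:q4_k_m
PARAMETER num_ctx 40960
```

Build a local tag from it with `ollama create pinion-40k -f Modelfile` (the name `pinion-40k` is just an example).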

Provenance

Artifact pipeline:

  1. LoRA SFT (rlhf.py)

  2. Merge into standalone checkpoint (merge_sft_lora.py)

  3. Export merged weights to canonical GGUF f16 (export_gguf.py)

  4. Generate downstream GGUF quantized variants from the f16 source

  5. Package those GGUF artifacts into Ollama runtime tags
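
Steps 4–5 can be sketched as a dry run below. The filenames and Modelfile names are assumptions rather than the pipeline's actual paths; `llama-quantize` is llama.cpp's quantization tool, and `ollama create` packages a GGUF referenced by a Modelfile's `FROM` line:

```bash
# Dry run: print the quantize/package command for each runtime variant.
# Filenames are illustrative; canonical names live in the upstream repos.
F16=qwen3-pinion-f16.gguf

for QUANT in Q8_0 Q5_K_M Q4_K_M; do
  tag=$(echo "$QUANT" | tr '[:upper:]' '[:lower:]')
  # llama.cpp's quantizer derives each variant from the f16 source:
  echo "llama-quantize $F16 qwen3-pinion-${tag}.gguf $QUANT"
  # Each quantized GGUF is then packaged as an Ollama tag via a Modelfile
  # whose FROM line points at that GGUF file:
  echo "ollama create treyrowell1826/qwen3-pinion:${tag} -f Modelfile.${tag}"
done
```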

Upstream code provenance DOI: 10.5281/zenodo.18607464 (https://doi.org/10.5281/zenodo.18607464)

Licensing Boundary

This package distributes downstream runtime model artifacts only.

The underlying model artifacts follow the licensing boundaries defined by the upstream artifact repositories:

Full weights: Somnus-Sovereign-Systems/qwen3-pinion

GGUF artifacts: Somnus-Sovereign-Systems/qwen3-pinion-gguf

The training, merge, export, and pipeline code used to produce these artifacts is licensed separately under GNU GPL v3.0 (GPLv3) in its respective code repository:

https://github.com/calisweetleaf/Reinforcement-Learning-Full-Pipeline/blob/main/LICENSE

Users must also comply with the applicable upstream terms for:

Somnus-Sovereign-Systems/qwen3-pinion

Qwen/Qwen3-1.7B

Magpie-Align/Magpie-Pro-300K-Filtered

Citation

@misc{rowell2026_qwen3_pinion_ollama,
  author    = {Rowell, Christian Trey Levi},
  title     = {qwen3-pinion},
  year      = {2026},
  url       = {https://ollama.com/treyrowell1826/qwen3-pinion},
  note      = {Ollama runtime distribution of the qwen3-pinion model family},
  publisher = {Ollama}
}