7,708 downloads · updated 1 month ago

Qwopus3.5-v3 is a reasoning-enhanced fine-tune family of Qwen3.5, designed to improve reasoning stability and correctness, optimize inference efficiency, and strengthen cross-task generalization, especially in programming.

vision · tools · thinking · 4b · 9b · 27b
ollama run fredrezones55/Qwopus3.5

Details

Updated 1 month ago

df5a8cff6e4a · 6.5GB
Architecture: qwen35 · Parameters: 9.41B · Quantization: Q4_K_M
Sampling defaults: { "presence_penalty": 1.5, "temperature": 1, "top_k": 20, "top_p": 0.95 }
Template: {{ .Prompt }}
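If you want to tweak these defaults locally, they can be restated in your own Modelfile; a minimal sketch using only values shown on this page (the custom model name `my-qwopus` is an arbitrary example):

```
# Modelfile — rebuild with: ollama create my-qwopus -f Modelfile
FROM fredrezones55/Qwopus3.5

# Sampling defaults shipped with this model (edit as needed)
PARAMETER presence_penalty 1.5
PARAMETER temperature 1
PARAMETER top_k 20
PARAMETER top_p 0.95
```

Then run it with `ollama run my-qwopus`.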

Readme

My internet is a tad slow, so the models might take a while to upload… I'll upload other quants of the model later.

Jackrong mentions they are working on a 4B model. As of [04/01/2026] only the 9B model is available. [04/02/2026] It looks like the 4B model has finished fine-tuning. [04/03/2026] Jackrong added a -v3 27B model 👏, my poor Internet router 🛜🤣

Patched to keep Qwen3.5’s full thinking + vision + tooling behavior in Ollama, no engine hacks required.

An attempt at a direct port of the Qwopus3.5-v3 fine-tuned family of optimized reasoning models, in GGUF format, to the Ollama engine. [The conversion was much faster this time since I did not have to debug my pipeline; it is the same base model as the Qwen3.5-Opus series.] The core change was converting the model to the layout the Ollama engine expects. The finetune was created by: https://huggingface.co/Jackrong

Qwopus3.5-v3 is a reasoning-enhanced model based on Qwen3.5. Its core objective is to simultaneously improve reasoning stability and correctness while optimizing inference efficiency, ultimately achieving stronger cross-task generalization capabilities—particularly in programming.


Evaluation Summary

While the overall accuracy margin (+1.43 pp) is modest, Qwopus3.5-9B-v3 shifts the accuracy-cost trade-off, achieving its win while spending significantly less reasoning budget. With a 25.3% reduction in mean thinking length and a 24.0% lower token cost per correct answer, this iteration is well optimized for latency, token budget, and context pressure.

Furthermore, across the mixed domain profile, Qwopus3.5-9B-v3 offsets Qwen3.5-9B’s slight edge in biology, CS, and math by excelling in physics and chemistry and by significantly lowering its unfinished-output rate. Its final rank owes as much to raw correctness as to an improved ability to reliably finish its reasoning and close out answers cleanly.

🗺️ Training Pipeline Overview

Base Model (Qwen3.5-9B)
 │
 ▼
Qwen3.5-9B fine-tuned with Unsloth
 │
 ▼
Supervised Fine-Tuning (SFT) + LoRA
(Response-Only Training masked on "<|im_start|>assistant\n<think>")
 │
 ▼
Qwopus3.5-9B-v3
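The SFT stage's response-only training (loss computed only on tokens after the "<|im_start|>assistant\n<think>" marker) can be illustrated with a toy sketch. This is an illustration of the masking idea over already-tokenized ids, not the exact Unsloth/TRL implementation; -100 is the ignore index used by PyTorch's cross-entropy loss:

```python
# Response-only training: the loss is computed only on tokens after the
# assistant marker; everything up to and including the marker is masked.
IGNORE_INDEX = -100  # PyTorch cross-entropy ignore_index

def mask_prompt_tokens(input_ids, marker_ids):
    """Return labels where every token up to and including the last
    occurrence of the marker subsequence is replaced by IGNORE_INDEX."""
    labels = list(input_ids)
    end = -1
    for i in range(len(input_ids) - len(marker_ids) + 1):
        if input_ids[i:i + len(marker_ids)] == marker_ids:
            end = i + len(marker_ids)  # keep scanning: mask up to the last match
    if end == -1:
        # No assistant marker found: mask the whole example so it adds no loss.
        return [IGNORE_INDEX] * len(labels)
    for i in range(end):
        labels[i] = IGNORE_INDEX
    return labels
```

With toy ids, `mask_prompt_tokens([7, 8, 3, 4, 5], [3])` masks the prompt and the marker itself, leaving loss only on the response tokens `4, 5`.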

🧠 Example of Learned Reasoning Scaffold

The model includes targeted optimizations addressing Qwen3.5’s tendency toward excessive or repetitive reasoning on simple queries. By distilling the structured reasoning habits of top-tier models like Claude Opus, Qwopus3.5-v3 adopts a highly organized, step-by-step cognitive layout.

Example:
The user is asking about [Topic A] and how it differs from [Topic B]. This is a [Task type] question. Let me break this down:
1. What is [Topic A]?
   - [Fact/Mechanism 1]
   - [Fact/Mechanism 2]
2. What is [Topic B]?
   - [Fact/Mechanism 1]
3. Key differences:
   - [Comparison Point 1]
   - [Comparison Point 2]
Let me make sure to be accurate: [...]
Actually, I should double-check: is [Fact] used before [Fact]? Yes, typically...
Let me provide a clear, well-structured answer:
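Since a scaffold like the one above is emitted inside the thinking block (the training marker shown in the pipeline ends in "<think>"), a small sketch of splitting a raw completion into the reasoning trace and the final answer, assuming Qwen-style <think>…</think> delimiters:

```python
import re

def split_thinking(text):
    """Separate the <think>…</think> reasoning trace from the final answer.

    Returns (thinking, answer); thinking is "" when no trace is present.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    thinking = m.group(1).strip()
    answer = (text[:m.start()] + text[m.end():]).strip()
    return thinking, answer
```

This is useful when you want to log or display the scaffold separately from the answer instead of showing the raw stream.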

📚 Training Data

The model was fine-tuned on a high-fidelity reasoning dataset, which was meticulously curated from a blend of premium open-source sources on Hugging Face. This dataset is the result of a rigorous mixing and cleaning process, specifically designed to filter out low-quality responses and ensure consistently strong logical performance across diverse analytical domains.

(Rest assured, the entire process is strictly by-the-book and 100% compliant with all terms and open-source licenses!)

⚠️ Limitations & Intended Use

  • Hallucination Risk: while reasoning is strong, the model remains an autoregressive LLM; factual claims made during the thinking sequence may occasionally be hallucinated, especially when they concern real-world events.
  • Intended Scenario: Best suited for offline analytical tasks, coding, math, and heavy logic-dependent prompting where the user needs to transparently follow the AI’s internal logic.
  • This model is a test version intended solely for learning and demonstration; use it for academic research and technical exploration only.