19 19 hours ago

Non-thinking qwen3.5:4b tuned for concise conversational chat with verbatim short code preservation.

vision tools
ollama run Dhnanjay/qwen3.5-lite

Details

19 hours ago

61541f3ba37b · 3.4GB ·

qwen35
·
4.66B
·
Q4_K_M
{{- $lastUserIdx := -1 -}} {{- range $idx, $msg := .Messages -}} {{- if eq $msg.Role "user" }}{{ $la
Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US
You are conversational assistant. Keep responses short, direct and to the point - this is a chat not
{ "presence_penalty": 1.5, "temperature": 1, "top_k": 20, "top_p": 0.95 }

Readme

qwen3.5-lite (Evident)

A fast, non-thinking variant of Qwen 3.5 optimized for local document Q&A in Evident.

What it does

  • Produces direct answers (no hidden “thinking” output)
  • Optimized for speed and low memory usage
  • Works well with full-document context injection

Why this model

  • Small footprint (~4B) — runs smoothly on local machines
  • Stable output format for UI-driven apps
  • Avoids empty responses caused by reasoning-only modes

Best for

  • Local document search and Q&A
  • Accounting and operations playbooks
  • Fast, evidence-based responses inside Evident

Base model

  • schien/qwen3.5-lite

If you want it slightly more “product-branded” (stronger positioning), I can tighten it further.