Dhnanjay/qwen3.5-lite

17 Downloads · Updated 18 hours ago

Non-thinking qwen3.5:4b tuned for concise conversational chat with verbatim short code preservation.

Capabilities: vision, tools

ollama run Dhnanjay/qwen3.5-lite

curl http://localhost:11434/api/chat \
  -d '{
    "model": "Dhnanjay/qwen3.5-lite",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='Dhnanjay/qwen3.5-lite',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'Dhnanjay/qwen3.5-lite',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Applications

Claude Code

ollama launch claude --model Dhnanjay/qwen3.5-lite

OpenClaw

ollama launch openclaw --model Dhnanjay/qwen3.5-lite

Hermes Agent

ollama launch hermes --model Dhnanjay/qwen3.5-lite

Codex

ollama launch codex --model Dhnanjay/qwen3.5-lite

OpenCode

ollama launch opencode --model Dhnanjay/qwen3.5-lite

Models

Name                  Size    Context   Input
qwen3.5-lite:latest   3.4GB   256K      Text, Image

1 model · Updated 18 hours ago
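Since the listing reports both Text and Image input, an image can presumably be sent alongside a prompt. The following is a minimal sketch, assuming the model accepts images through the `images` field of an Ollama chat message; the helper names `build_image_message` and `describe_image` are hypothetical and not part of this page.

```python
import base64
from typing import Dict


def build_image_message(image_bytes: bytes, prompt: str) -> Dict[str, object]:
    """Return a user message carrying one base64-encoded image (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"role": "user", "content": prompt, "images": [encoded]}


def describe_image(path: str, prompt: str = "Describe this image.",
                   model: str = "Dhnanjay/qwen3.5-lite") -> str:
    """Send one image file to the model and return its text reply."""
    # Imported lazily so build_image_message works without the client installed.
    from ollama import chat

    with open(path, "rb") as f:
        message = build_image_message(f.read(), prompt)
    response = chat(model=model, messages=[message])
    return response.message.content
```

Calling `describe_image("receipt.png")` requires a running Ollama server with the model pulled; the file name here is only an illustration.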

Readme

qwen3.5-lite (Evident)

A fast, non-thinking variant of Qwen 3.5 optimized for local document Q&A in Evident.

What it does

Produces direct answers (no hidden “thinking” output)
Optimized for speed and low memory usage
Works well with full-document context injection
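Full-document context injection can be sketched as placing the entire document text in the system message and the question in the user message. This is an illustrative sketch, not how Evident does it: the prompt wording, the `<document>` delimiters, and the helper names `build_messages` and `ask` are all assumptions.

```python
from typing import Dict, List


def build_messages(document: str, question: str) -> List[Dict[str, str]]:
    """Build a chat payload that injects the whole document as context."""
    system = (
        "Answer using only the document below. "
        "If the answer is not in the document, say so.\n\n"
        f"<document>\n{document}\n</document>"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]


def ask(document: str, question: str,
        model: str = "Dhnanjay/qwen3.5-lite") -> str:
    """Ask one question about one document and return the reply text."""
    # Imported lazily so build_messages can be used without the client installed.
    from ollama import chat

    response = chat(model=model, messages=build_messages(document, question))
    return response.message.content
```

The document must fit within the model's context window (listed as 256K) together with the question and the reply.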

Why this model

Small footprint (~4B parameters, 3.4GB download) that runs smoothly on local machines
Stable output format for UI-driven apps
Avoids empty responses caused by reasoning-only modes
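A UI-driven app may still want to guard against a blank reply before rendering it. The sketch below shows one way to do that; `display_text` and its fallback string are hypothetical and not part of Evident or Ollama.

```python
from typing import Optional


def display_text(content: Optional[str],
                 fallback: str = "No answer was returned.") -> str:
    """Return trimmed reply text, or a fallback when the reply is empty."""
    text = (content or "").strip()
    return text if text else fallback
```

In practice this would wrap `response.message.content` from the chat call before the text is shown to the user.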

Best for

Local document search and Q&A
Accounting and operations playbooks
Fast, evidence-based responses inside Evident

Base model

schien/qwen3.5-lite
