Details

Updated 8 hours ago

8 hours ago

7d498121e4e7 · 3.4GB ·

model

archqwen35

parameters4.54B

quantizationQ4_K_M

3.4GB

system

/nothink You are a coding agent running inside Claude Code. Be concise. Avoid loops. Use tools only

264B

license

Credits to Jackrong. Base model: Qwopus3.5 4B Original model: Qwen3.5 4B

78B

params

{ "num_ctx": 32000, "presence_penalty": 1.5, "repeat_last_n": 2048, "repeat_penalty"

134B

template

13B

CCode79 v2 - 4B param, Q4, 32K ctx, Local/Offline, 8GB GPU

Custom Ollama model, fine-tuned from Qwopus3.5-4B from Jackrong, configured for local coding-agent workflows, especially with Claude Code.

This model is based on a 4B parameter LLM, quantized in Q4, and configured with a large context window for software development tasks. It is intended to provide a practical balance between performance, memory usage, and code-assistance quality on local hardware.

Model details

Type: Text/image model
Size: 4B parameters
Quantization: Q4
Context target: 32K
Real GPU memory usage: 7,1 GB VRAM
Recommended GPU memory: 8 GB VRAM
Main focus: Coding and agentic development workflows
Tool use: Supported, depending on the client/application
Thinking/reasoning mode: Supported, depending on the client/application

Intended use

This model is designed for:

Claude Code workflows
Local coding assistants
Code analysis
Debugging support
Refactoring suggestions
Project exploration
Terminal-based programming tasks
Educational demonstrations of AI coding agents

Custom model v2 for Claude Code to use locally with 8gb GPUs (maybe tiny miracles?...)

Details

Readme

CCode79 v2 - 4B param, Q4, 32K ctx, Local/Offline, 8GB GPU

Model details

Intended use