26 7 hours ago

Custom model v2 for Claude Code to use locally with 8gb GPUs (maybe tiny miracles?...)

vision tools thinking
ollama run SetneufPT/ccode79v2_4b_q4_32k_8gb-gpu

Applications

Claude Code
Claude Code ollama launch claude --model SetneufPT/ccode79v2_4b_q4_32k_8gb-gpu
Codex App
Codex App ollama launch codex-app --model SetneufPT/ccode79v2_4b_q4_32k_8gb-gpu
OpenClaw
OpenClaw ollama launch openclaw --model SetneufPT/ccode79v2_4b_q4_32k_8gb-gpu
Hermes Agent
Hermes Agent ollama launch hermes --model SetneufPT/ccode79v2_4b_q4_32k_8gb-gpu
Codex
Codex ollama launch codex --model SetneufPT/ccode79v2_4b_q4_32k_8gb-gpu
OpenCode
OpenCode ollama launch opencode --model SetneufPT/ccode79v2_4b_q4_32k_8gb-gpu

Models

View all →

Readme


CCode79 v2 - 4B param, Q4, 32K ctx, Local/Offline, 8GB GPU

Custom Ollama model, fine-tuned from Qwopus3.5-4B from Jackrong, configured for local coding-agent workflows, especially with Claude Code.

This model is based on a 4B parameter LLM, quantized in Q4, and configured with a large context window for software development tasks. It is intended to provide a practical balance between performance, memory usage, and code-assistance quality on local hardware.

Model details

  • Type: Text/image model
  • Size: 4B parameters
  • Quantization: Q4
  • Context target: 32K
  • Real GPU memory usage: 7,1 GB VRAM
  • Recommended GPU memory: 8 GB VRAM
  • Main focus: Coding and agentic development workflows
  • Tool use: Supported, depending on the client/application
  • Thinking/reasoning mode: Supported, depending on the client/application

Intended use

This model is designed for:

  • Claude Code workflows
  • Local coding assistants
  • Code analysis
  • Debugging support
  • Refactoring suggestions
  • Project exploration
  • Terminal-based programming tasks
  • Educational demonstrations of AI coding agents

image.png