1,158 Downloads Updated 2 months ago
ollama run brnpistone/Qwen3-4B-AgentCoder-q6-k
ollama launch claude --model brnpistone/Qwen3-4B-AgentCoder-q6-k
ollama launch codex --model brnpistone/Qwen3-4B-AgentCoder-q6-k
ollama launch opencode --model brnpistone/Qwen3-4B-AgentCoder-q6-k
ollama launch openclaw --model brnpistone/Qwen3-4B-AgentCoder-q6-k
Qwen3-4B-AgentCoder-GGUF is a quantized, version of the Qwen3-4B-AgentCoder model, using llama.cpp. This model is optimized for: - ๐งฎ Complex reasoning tasks - ๐งฐ Tool calling - ๐ป Code generation
The model was developed through sequential fine-tuning, followed by a Direct Preference Optimization (DPO) post-training stage to improve alignment, coherence, and reasoning accuracy.
Qwen3-4B-AgentCoder-GGUF can be used directly for: - โ Tool calling in complex reasoning tasks - โ Code generation for Python, JS, and other languages - โ Multi-domain reasoning (math, logic, Q&A)
1e-54431000.01interstellarninja/hermes_reasoning_tool_use (~51K) โ multi-turn tool use3e-51821000.05ise-uiuc/Magicoder-OSS-Instruct-75K (~38K) โ code generationopen-thoughts/OpenThoughts-114k (~37K) โ general reasoninginterstellarninja/hermes_reasoning_tool_use (~30K) โ tool usecustom/dpo-toolcode-alignment (~15K) โ DPO preference pairsAfter sequential fine-tuning, the model underwent a DPO phase to enhance response alignment, reasoning robustness, and factual consistency.
1e-6285sigmoid2Objective - Encourage the model to prefer chosen completions - Improve clarity, correctness, and helpfulness - Reduce hallucinations and verbosity
The model was evaluated on multiple benchmarks to assess its capabilities across different domains:
| Benchmark | Score | Details |
|---|---|---|
| HumanEval | 72.0% | Base tests |
| HumanEval+ | 68.5% | Base + extra tests |
| GSM8K | 82.0% | 1082โ1319 correct |
| MMLU | 77.7% | 1190โ1531 correct (validation split) |
| Multi-turn Tool Calling | 70.0% | 70โ100 correct |
openai/openai_humaneval - 164 hand-written programming problemsopenai/gsm8k (test split) - 1,319 grade school math word problemscais/mmlu (validation split) - 1,531 multiple-choice questions across 57 subjectsHardware
- GPU: NVIDIA H100 (80 GB VRAM)
- System RAM: 2 TiB
- Memory per vCPU: 10.67 GiB
Software
- Python: 3.12
- Transformers: 4.55.0
- Libraries: bitsandbytes, safetensors, torch, trl, scikit-learn, tokenizers, psutil, py7zr
๐ง Qwen3-4B-AgentCoder-GGUF โ created by Bruno Pistone
Enhanced reasoning, tool calling, and code generation โ refined with DPO alignment