ollama run VenomBlood/yicoder-q4
A quantized version of 01-ai/Yi-Coder-9B-Chat, converted to GGUF Q4_K_M format using llama.cpp. Optimized for deterministic code generation and test writing on consumer hardware.
| Property | Value |
|---|---|
| Base Model | 01-ai/Yi-Coder-9B-Chat |
| Format | GGUF Q4_K_M |
| Quantization | Q4_K_M (via llama.cpp) |
| File Size | ~5.3 GB |
| RAM Required | ~8 GB |
| Context Window | 8192 tokens configured (base model supports up to 128k) |
| License | Apache 2.0 |
ollama run VenomBlood/yicoder-q4
| Parameter | Value | Reason |
|---|---|---|
| temperature | 0.1 | Near-deterministic code output |
| top_p | 0.9 | Focused sampling |
| repeat_penalty | 1.1 | Reduces repetition in code |
| num_ctx | 8192 | Safe context size for code tasks |
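The same parameters can also be passed per request instead of being baked into the Modelfile. The following is a minimal sketch, assuming a local Ollama server on its default port (11434) and the standard `/api/generate` endpoint; the helper names `build_request` and `generate` are hypothetical and not part of this model's distribution.

```python
import json
import urllib.request

# Sampling options copied from the parameter table above.
OPTIONS = {
    "temperature": 0.1,
    "top_p": 0.9,
    "repeat_penalty": 1.1,
    "num_ctx": 8192,
}


def build_request(prompt, model="VenomBlood/yicoder-q4",
                  host="http://localhost:11434"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
        "options": OPTIONS,
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the request; requires a running Ollama server (assumption)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Write a pytest test for ...")` should return the generated text as a string, provided the server is running and the model has been pulled.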
This model is configured as an expert Python software engineer specializing in writing comprehensive pytest test suites. It returns only valid Python code with no markdown fences or explanations.
>>> Write a pytest test for this function:
def add(a: int, b: int) -> int:
    return a + b

def test_add():
    assert add(1, 2) == 3
    assert add(-1, 1) == 0
    assert add(-1, -2) == -3
Quantized on Kaggle (Tesla T4 x2) with llama.cpp, and run through llama-cli before publishing. Q8_0 was used as the intermediate format (instead of F16 at 18 GB) to fit within Kaggle’s free-tier disk limits.
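The disk-space reasoning can be checked with rough arithmetic. The sketch below assumes a parameter count of about 8.8 billion for the base model and an average of about 4.85 bits per weight for Q4_K_M; both are approximations introduced here, not figures from this page. Q8_0 stores each block of 32 weights as 32 int8 values plus one 16-bit scale, which works out to 8.5 bits per weight.

```python
# Assumed approximate parameter count for the base model (not stated above).
PARAMS = 8.8e9

# Average bits per weight. F16 and Q8_0 follow from the formats themselves;
# the Q4_K_M value is an approximation and varies slightly by architecture.
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q4_K_M": 4.85}


def size_gb(params, bits_per_weight):
    """Estimated file size in decimal gigabytes, ignoring metadata overhead."""
    return params * bits_per_weight / 8 / 1e9


for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{size_gb(PARAMS, bpw):.1f} GB")
```

Under these assumptions the estimates land close to the 18 GB F16 figure and the ~5.3 GB file size quoted above, with the Q8_0 intermediate at roughly half the F16 size.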
# Download GGUF + Modelfile
huggingface-cli download AkshajSeerpu/yi-coder-9b-q4km-gguf \
  yi-coder-9b-q4_k_m.gguf Modelfile --local-dir ./
# Import into Ollama
ollama create yicoder-q4 -f Modelfile
# Run
ollama run yicoder-q4