
A lightweight, FIM-optimized (Fill-In-the-Middle) variant of Qwen2.5-Coder-0.5B-Instruct, using the fp16 GGUF build from Hugging Face. At only ~1 GB, it fits comfortably on a single 8 GB GPU, with headroom for an 8K context.
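
As a rough illustration of what a FIM request looks like, the sketch below assembles a prompt from the code before and after the cursor. It assumes this build keeps the sentinel tokens documented for Qwen2.5-Coder (`<|fim_prefix|>`, `<|fim_suffix|>`, `<|fim_middle|>`). The helper name `build_fim_prompt` is hypothetical, and the prompt should be sent as a raw completion, without a chat template.

```python
# Sentinel tokens as documented for Qwen2.5-Coder. This is an assumption for
# this particular build; check them against the tokenizer you actually load.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Return a FIM prompt; the model is expected to generate the missing middle.

    `build_fim_prompt` is a hypothetical helper, not part of any library.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Code before the cursor and code after the cursor.
prefix = "def add(a, b):\n    return "
suffix = "\n\nprint(add(1, 2))\n"

prompt = build_fim_prompt(prefix, suffix)
print(prompt)
```

The text the model generates after `<|fim_middle|>` is the infill. An editor integration would splice it between `prefix` and `suffix`.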