63 Downloads Updated 2 weeks ago
793ac3cbe314 · 14GB ·
WEIRD COMPOUND (VERSION 1.7) / I-MATRIX / 24B / I-QUANT
As of December 1st, 2025, this model is the 20th highest-performing non-proprietary LLM for writing/storytelling on the Hugging Face UGI leaderboard, beating models hundreds of billions of parameters larger and scoring virtually identically to the proprietary GPT-5 with reasoning and the full 671-billion-parameter DeepSeek V3 0324. No higher-ranking model comes close to this model's small size. To fit as many parameters into as little VRAM as possible, weighted I-quants will be listed.
Note that I-quants trade some token generation speed relative to K-quants in exchange for storage efficiency. The 4-bit medium K-quant fits on 16GB GPUs. For users who want the model's full performance, the 8-bit version is also included; note that the 8-bit model cannot be fully offloaded onto a GPU with 24GB of VRAM or less. Weighted quants differ from static quants in that the 'importance' of each weight is taken into account when lowering the bit depth; since the 8-bit model is effectively the 'full' model, weighting is unnecessary there, and the uploaded Q8_0 is a static quant. These GGUF files were sourced from Hugging Face.
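The VRAM claims above can be sanity-checked with some quick arithmetic. The sketch below estimates the weight footprint of a 24B-parameter model at a few common GGUF quantization levels; the bits-per-weight figures are approximate nominal values (assumptions on my part, not exact file sizes, which also include embeddings, metadata, and per-block scales):

```python
# Rough size estimate for a 24B-parameter model at common GGUF quant levels.
# Bits-per-weight values below are approximate nominal figures (assumed).
PARAMS = 24e9

BITS_PER_WEIGHT = {
    "Q8_0": 8.5,     # static 8-bit (8 bits + per-block scale)
    "Q4_K_M": 4.85,  # 4-bit medium K-quant
    "IQ4_XS": 4.25,  # 4-bit I-quant
    "IQ3_M": 3.66,   # 3-bit I-quant
}

def approx_size_gb(params: float, bpw: float) -> float:
    """Approximate weight footprint in GB (decimal) at the given bits per weight."""
    return params * bpw / 8 / 1e9

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{approx_size_gb(PARAMS, bpw):.1f} GB")
```

Under these assumptions, Q8_0 lands around 25.5 GB, consistent with not fitting on a 24GB card, while Q4_K_M comes out near 14.5 GB, consistent with running on a 16GB GPU.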
Original model (FlareRebellion):
GGUF weighted quantizations (mradermacher):
GGUF static quantizations (mradermacher):