100 Downloads Updated 2 weeks ago
2a3f87445fdb · 14GB ·
THE OMEGA DIRECTIVE / I-MATRIX / 24B / I-QUANT
The creator trained this model on multi-turn chat data and removed the clichéd LLM response mannerisms, earning it the "unslop" moniker, and I can attest to this. In my own testing, the model also keeps characters distinct in both single and group chats. To fit as many parameters into as little VRAM as possible, weighted I-quants are listed below.
Note that I-quants trade some token-generation speed relative to K-quants for storage efficiency. Any 4-bit or smaller quant will fit on a 16GB GPU, though the small K-quant is recommended over the I-quant when speed matters. These quantizations were taken from GGUF files on Hugging Face.
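The "4-bit fits in 16GB" claim comes down to simple arithmetic: weight size in GB is roughly parameter count (in billions) times bits per weight divided by 8, plus some overhead for the KV cache and buffers. A minimal sketch of that estimate, assuming a hypothetical helper (not part of any library) and an illustrative ~4.25 bits-per-weight figure for a small 4-bit quant:

```python
# Rough VRAM-fit estimate for a quantized model.
# Hypothetical helper for illustration; bpw values vary by quant type.
def fits_in_vram(params_b: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    # 1B params at 8 bits/weight is ~1 GB, so scale linearly by bpw.
    weight_gb = params_b * bits_per_weight / 8
    # Assumed flat overhead for KV cache, activations, and buffers.
    return weight_gb + overhead_gb <= vram_gb

# 24B model at an assumed ~4.25 bpw on a 16 GB GPU:
print(fits_in_vram(24, 4.25, 16))   # ~12.75 GB weights + overhead fits
# Same model unquantized at 8 bpw would not:
print(fits_in_vram(24, 8.0, 16))
```

The overhead figure is a guess; long contexts or large batch sizes grow the KV cache well past 2 GB, which is why a 4-bit 24B quant is comfortable on 16GB but larger quants are not.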
GGUF weighted quantizations (mradermacher):