Available quantizations (all with a 128K context window):

THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:Q3_K_S · 10GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:Q4_K_S · 14GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:Q4_K_M · 14GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:Q5_K_M · 17GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:Q6_K · 19GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:IQ2_XXS · 6.5GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:IQ3_XXS · 9.3GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:IQ3_S · 10GB
THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b:IQ4_XS · 13GB
THE OMEGA DIRECTIVE / I-MATRIX / 24B / I-QUANT
The creator trained this model on multi-turn chat data and stripped out the clichéd LLM response mannerisms, earning it the “unslop” moniker; my own testing bears this out. I’ve also found that the model keeps characters distinct in both single and group chats. To fit as many parameters as possible into limited VRAM, weighted I-quants are listed alongside the standard K-quants.
Note that I-quants trade some token-generation speed for storage efficiency relative to K-quants. Any 4-bit or smaller quant will fit on a 16GB GPU, though the small K-quant (Q4_K_S) is recommended over the I-quant (IQ4_XS) when speed matters. These quantizations were pulled from GGUF files hosted on Hugging Face.
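
As a rough illustration of that sizing logic, here is a minimal Python sketch that picks the largest listed quant fitting a given VRAM budget and then downloads and queries it with the official `ollama` Python client. The ~2GB headroom figure for KV cache and runtime overhead is an assumption rather than a benchmarked number, and the tag string may need the publisher's namespace prefix on Ollama.

```python
# Minimal sketch: choose the largest quant that fits a VRAM budget, then
# pull and chat with it via the official `ollama` client (pip install ollama).
import ollama

# On-disk sizes in GB, copied from the tag listing above.
QUANT_SIZES_GB = {
    "IQ2_XXS": 6.5,
    "IQ3_XXS": 9.3,
    "Q3_K_S": 10.0,
    "IQ3_S": 10.0,
    "IQ4_XS": 13.0,
    "Q4_K_S": 14.0,
    "Q4_K_M": 14.0,
    "Q5_K_M": 17.0,
    "Q6_K": 19.0,
}

# May require the publisher's namespace prefix, e.g. "user/MODEL".
MODEL = "THE_OMEGA_DIRECTIVE-Mistral_Small3.2-24b"


def pick_quant(vram_gb: float, headroom_gb: float = 2.0) -> str:
    """Return the largest listed quant whose weights leave `headroom_gb` free.

    The headroom value is an assumed allowance for the KV cache and runtime
    overhead, not a measured figure.
    """
    candidates = [q for q, size in QUANT_SIZES_GB.items()
                  if size + headroom_gb <= vram_gb]
    if not candidates:
        raise ValueError(f"No listed quant fits in {vram_gb} GB of VRAM")
    return max(candidates, key=QUANT_SIZES_GB.get)


if __name__ == "__main__":
    tag = pick_quant(vram_gb=16.0)      # picks Q4_K_S (14GB) on a 16GB card
    ollama.pull(f"{MODEL}:{tag}")       # download the chosen quantization
    reply = ollama.chat(
        model=f"{MODEL}:{tag}",
        messages=[{"role": "user", "content": "Introduce yourself in character."}],
    )
    print(reply["message"]["content"])
```

With the default 2GB headroom, a 16GB card lands on Q4_K_S, which matches the speed recommendation above; lower the budget (e.g. 12GB) and the sketch falls back to the smaller I-quants.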
GGUF weighted quantizations (mradermacher):