53 Downloads · Updated 2 weeks ago
9 models:

| Name | Size | Context window | Input | Updated |
| --- | --- | --- | --- | --- |
| MARS-Gemma3-28b:Q2_K | 11GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q2_K_S | 10GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q3_K_S | 13GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q3_K_M | 14GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q4_K_M | 17GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q5_K_M | 20GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:IQ3_XXS | 11GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:IQ3_S | 13GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:IQ4_XS | 16GB | 128K | Text | 2 weeks ago |
MARS / I-MATRIX / 28B / I-QUANT
A relatively new model as of November 2025, with claims of beating Mistral 24b and of handling complex character cards well. To stuff as many parameters into as little VRAM as possible, weighted K-quants and I-quants are listed here across multiple quantization levels, as the original quantizer has offered many.
Note that I-quants forfeit some token-generation speed relative to K-quants in exchange for storage efficiency. The model's creator has mentioned that it is heavy on memory usage: they could not fit it into VRAM at only 8K context even with the 4-bit small quantization (Q4_K_S).
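To get a feel for why 8K context can overflow VRAM, here is a minimal back-of-the-envelope sizing sketch in Python. The file sizes come from the tag list above; the layer and head counts are assumptions (placeholders loosely based on Gemma-3-27B's published config, since MARS-Gemma3-28b's exact architecture isn't documented here), so treat the output as a rough estimate, not a measurement.

```python
# Rough VRAM estimate: quantized weights + KV cache + runtime overhead.
# All architecture numbers below are ASSUMPTIONS (placeholders loosely
# modeled on Gemma-3-27B); substitute real values if you know them.

GIB = 1024**3

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context, bytes_per_elem=2):
    # K and V each store (context x n_kv_heads x head_dim) per layer, fp16.
    return 2 * n_layers * n_kv_heads * head_dim * context * bytes_per_elem

def estimate(file_size_gib, context, vram_gib, overhead_gib=1.0,
             n_layers=62, n_kv_heads=16, head_dim=128):
    kv = kv_cache_bytes(n_layers, n_kv_heads, head_dim, context) / GIB
    total = file_size_gib + kv + overhead_gib
    return total, total <= vram_gib

for tag, size_gib in [("Q4_K_M", 17), ("IQ3_S", 13), ("IQ3_XXS", 11)]:
    total, ok = estimate(size_gib, context=8192, vram_gib=16)
    print(f"{tag}: ~{total:.1f} GiB at 8K context -> "
          f"{'fits' if ok else 'does not fit'} in 16 GiB")
```

Note that real-world numbers can land lower than this sketch suggests: sliding-window attention layers, KV-cache quantization, or partial CPU offload all shrink the footprint.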
Although Q4_K_S is considered the ‘standard’ when running roleplay models locally, 3-bit quants have become increasingly usable at higher parameter counts compared to 4-bit quants, to the point that the 3-bit small quantization should stay performant at 28b (anything above 18b, in my experience). The importance matrix and the I-quants included here should help remedy the memory pressure regardless. The small 3-bit quants are recommended for 16GB GPUs. These quantizations were taken from GGUF uploads on Hugging Face.
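For completeness, a minimal usage sketch with the official `ollama` Python client, using the 16GB-GPU recommendation above. It assumes an Ollama server is running locally and the tag has already been pulled (e.g. `ollama pull MARS-Gemma3-28b:IQ3_S`); the prompt is purely illustrative.

```python
# Minimal sketch: chat with the IQ3_S quant via the `ollama` Python client.
# Assumes `ollama serve` is running and the tag exists locally under this name.
import ollama

response = ollama.chat(
    model="MARS-Gemma3-28b:IQ3_S",
    messages=[{"role": "user", "content": "Introduce yourself in character."}],
)
print(response["message"]["content"])
```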
GGUF weighted quantizations (mradermacher):
[No obligatory model picture. Mars did not have one.]