53 Downloads · Updated 2 weeks ago
9 models:

| Name | Size | Context window | Input | Updated |
| --- | --- | --- | --- | --- |
| MARS-Gemma3-28b:Q2_K | 11GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q2_K_S | 10GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q3_K_S | 13GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q3_K_M | 14GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q4_K_M | 17GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:Q5_K_M | 20GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:IQ3_XXS | 11GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:IQ3_S | 13GB | 128K | Text | 2 weeks ago |
| MARS-Gemma3-28b:IQ4_XS | 16GB | 128K | Text | 2 weeks ago |
MARS / I-MATRIX / 28B / I-QUANT
A relatively new model as of November 2025, with claims of beating Mistral 24b and of handling complex character cards well. To stuff as many parameters into as little VRAM as possible, weighted K-quants and I-quants are listed here across multiple quantization levels, as the original quantizer has offered many.
Note that I-quants forfeit some token-generation speed relative to K-quants in exchange for storage efficiency. The model's creator has mentioned that it is heavy on memory usage: they could not fit it into VRAM at only 8K context even with the 4-bit small quantization (Q4_K_S).
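To get a feel for why 8K context can overflow VRAM, here is a minimal back-of-the-envelope sizing sketch in Python. The file sizes come from the tag list above; the layer and head counts are assumptions (placeholders loosely based on Gemma-3-27B's published config, since MARS-Gemma3-28b's exact architecture isn't documented here), so treat the output as a rough estimate, not a measurement.

```python
# Rough VRAM estimate: quantized weights + KV cache + runtime overhead.
# All architecture numbers below are ASSUMPTIONS (placeholders loosely
# modeled on Gemma-3-27B); substitute real values if you know them.

GIB = 1024**3

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context, bytes_per_elem=2):
    # K and V each store (context x n_kv_heads x head_dim) per layer, fp16.
    return 2 * n_layers * n_kv_heads * head_dim * context * bytes_per_elem

def estimate(file_size_gib, context, vram_gib, overhead_gib=1.0,
             n_layers=62, n_kv_heads=16, head_dim=128):
    kv = kv_cache_bytes(n_layers, n_kv_heads, head_dim, context) / GIB
    total = file_size_gib + kv + overhead_gib
    return total, total <= vram_gib

for tag, size_gib in [("Q4_K_M", 17), ("IQ3_S", 13), ("IQ3_XXS", 11)]:
    total, ok = estimate(size_gib, context=8192, vram_gib=16)
    print(f"{tag}: ~{total:.1f} GiB at 8K context -> "
          f"{'fits' if ok else 'does not fit'} in 16 GiB")
```

Note that real-world numbers can land lower than this sketch suggests: sliding-window attention layers, KV-cache quantization, or partial CPU offload all shrink the footprint.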
Although Q4_K_S is considered the ‘standard’ when running roleplay models locally, 3-bit quants have become increasingly usable at higher parameter counts compared to 4-bit quants, to the point that the 3-bit small quantization should stay performant at 28b (anything above 18b, in my experience). The importance matrix and the I-quants included here should help remedy the memory pressure regardless. The small 3-bit quants are recommended for 16GB GPUs. These quantizations were taken from GGUF uploads on Hugging Face.
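For completeness, a minimal usage sketch with the official `ollama` Python client, using the 16GB-GPU recommendation above. It assumes an Ollama server is running locally and the tag has already been pulled (e.g. `ollama pull MARS-Gemma3-28b:IQ3_S`); the prompt is purely illustrative.

```python
# Minimal sketch: chat with the IQ3_S quant via the `ollama` Python client.
# Assumes `ollama serve` is running and the tag exists locally under this name.
import ollama

response = ollama.chat(
    model="MARS-Gemma3-28b:IQ3_S",
    messages=[{"role": "user", "content": "Introduce yourself in character."}],
)
print(response["message"]["content"])
```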
GGUF weighted quantizations (mradermacher):
[No obligatory model picture. Mars did not have one.]