260 2 weeks ago

A structurally extracted, text-only iteration of Google's multimodal gemma-4-E4B-it model. Vision and audio encoders have been fully decoupled to minimize VRAM footprint for text-centric workloads. System Prompt to address lost abilities.

tools thinking
ollama run fauxpaslife/gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M

Applications

Claude Code
Claude Code ollama launch claude --model fauxpaslife/gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M
Codex
Codex ollama launch codex --model fauxpaslife/gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M
OpenCode
OpenCode ollama launch opencode --model fauxpaslife/gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M
OpenClaw
OpenClaw ollama launch openclaw --model fauxpaslife/gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M
Hermes Agent
Hermes Agent ollama launch hermes --model fauxpaslife/gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M

Models

View all →

Readme

Screenshot 2026-04-10 221131.jpg

Acquired from HF. Generated via the Kitsune Fine Tuning Suite. Tested in Ollama.

image.png

Upon import to Ollama, I added a template for the GGUF to turn properly.

I also attempted to add a default system prompt, correcting the model’s lack of multimodal abilities. Time will tell ;)

🦊💖🦙

Model Card From HF : ozgurpolat/gemma-4-E4B-it-text-only-GGUF

Gemma 4 E4B (Text-Only) - GGUF This repository provides a structurally extracted, text-only iteration of Google’s multimodal gemma-4-E4B-it model. Vision and audio encoders have been fully decoupled to minimize VRAM footprint for text-centric workloads.

Model Format Serialization: GGUF (gemma4 architecture layout) Quantization: Q4_K_M Base Parameters: 8B (Text layer extraction) Note on Zero-Shot Modality Queries: The text parameters retain their original RLHF conditioning. The model will assert multimodal capabilities (e.g., confirming it can interpret images) despite hardware encoders being purged. Overriding this behavior requires explicit bounding via the system prompt.