16.3K 24 minutes ago

Gemma 4 Turbo is an optimized version of Google's Gemma 4 (9B) model that achieves 51% faster CPU inference through int4 quantization and performance tuning. It is well suited to local AI assistants, tool calling, and chat applications on Windows systems without a GPU.

vision tools thinking audio e2b e4b 26b 31b
6a6a221cea43 · 338B
Gemma 4 Turbo 31b is the flagship edition of the Gemma 4 Turbo family: Google's most capable Gemma 4 model size, optimized for CPU via TurboQuant inference tuning with KV cache quantization (Stage 1 TurboQuant). It delivers the best reasoning, coding, and instruction-following quality in the family on systems with 24GB+ RAM. No GPU required.
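
As a rough illustration of why a 24GB+ RAM recommendation is plausible for an int4 model of this size, the sketch below estimates the memory taken by the weights alone. It assumes that the "31b" tag means roughly 31 billion parameters and that every weight is stored at exactly 4 bits; neither is confirmed by this page. Real quantized files usually carry per-block scales and keep some tensors at higher precision, and the KV cache and runtime need additional memory on top, so treat the result as a lower bound rather than a measured figure.

```python
def weight_footprint_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB (weights only)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: "31b" is read as about 31 billion parameters (hypothetical value).
params_31b = 31e9

int4_gib = weight_footprint_gib(params_31b, 4)   # 4-bit quantized weights
fp16_gib = weight_footprint_gib(params_31b, 16)  # 16-bit weights, for comparison

print(f"int4 weights: {int4_gib:.1f} GiB")
print(f"fp16 weights: {fp16_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take about a quarter of the space of 16-bit weights, which leaves headroom within 24 GB for the KV cache and the operating system.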