Gemma 4 Turbo is an optimized version of Google's Gemma 4 (9B) model, achieving 51% faster CPU inference through int4 quantization and performance tuning. It is well suited to local AI assistants, tool calling, and chat applications on Windows systems without a GPU.

vision tools thinking audio e2b e4b 26b 31b
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
Copyright 2024 Google LLC
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---
This is an optimized derivative work based on google/gemma-4-31b-it.
Modifications: int4 quantization and CPU inference optimization via turboquant.
Modified by: ssfdre38
Date: 2026