I’m excited to share a new agentic fine-tune of Gemma-4-12B designed specifically for tool-calling and raw reasoning loops on Apple Silicon.

tools thinking

ollama run cyborgxx101/gemma-4-12b-opus-finetuned-mlx

curl http://localhost:11434/api/chat \
  -d '{
    "model": "cyborgxx101/gemma-4-12b-opus-finetuned-mlx",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='cyborgxx101/gemma-4-12b-opus-finetuned-mlx',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'cyborgxx101/gemma-4-12b-opus-finetuned-mlx',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
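Since the model is tuned for tool calling, a minimal sketch of a tool-calling loop against the local Ollama chat endpoint may be useful. It uses only the Python standard library. `get_weather`, `TOOLS`, `TOOL_SPECS`, and `agent_loop` are hypothetical names invented for illustration, and the `tool_name` field on tool-result messages is an assumption based on recent Ollama releases.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # endpoint from the curl example
MODEL = "cyborgxx101/gemma-4-12b-opus-finetuned-mlx"


def get_weather(city: str) -> str:
    """Hypothetical tool: a stub that returns canned text, not real data."""
    return f"Sunny in {city}"


# Registry mapping tool names to Python callables (illustrative only).
TOOLS = {"get_weather": get_weather}

# JSON-schema description of the tool, passed in the request's `tools` field.
TOOL_SPECS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def send_to_ollama(messages: list) -> dict:
    """POST one non-streaming chat request and return the assistant message."""
    payload = {"model": MODEL, "messages": messages,
               "tools": TOOL_SPECS, "stream": False}
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]


def agent_loop(prompt: str, send=send_to_ollama, max_steps: int = 5) -> str:
    """Call the model, run any requested tools, and repeat until it answers."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = send(messages)
        messages.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply.get("content", "")
        for call in calls:
            fn = call["function"]
            result = TOOLS[fn["name"]](**fn["arguments"])
            # Feed each tool result back to the model as a `tool` message.
            messages.append({"role": "tool", "tool_name": fn["name"],
                             "content": str(result)})
    raise RuntimeError("model did not produce a final answer in time")


# Usage (needs a running Ollama server with the model pulled):
# print(agent_loop("What is the weather in Paris?"))
```

Passing `send` as a parameter keeps the loop independent of the transport, so it can be exercised with a stub before pointing it at a live server.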

Applications

Claude Code: ollama launch claude --model cyborgxx101/gemma-4-12b-opus-finetuned-mlx

OpenCode: ollama launch opencode --model cyborgxx101/gemma-4-12b-opus-finetuned-mlx

Hermes Agent: ollama launch hermes --model cyborgxx101/gemma-4-12b-opus-finetuned-mlx

OpenClaw: ollama launch openclaw --model cyborgxx101/gemma-4-12b-opus-finetuned-mlx

Models

3 models

gemma-4-12b-opus-finetuned-mlx:latest

24GB · 128K context window · Text · 1 month ago

gemma-4-12b-opus-finetuned-mlx:4bit

7.4GB · 128K context window · Text · 1 month ago

gemma-4-12b-opus-finetuned-mlx:8bit

13GB · 128K context window · Text · 1 month ago

Readme

---
license: gemma
base_model: mlx-community/gemma-4-12b-it-bf16
tags:
- mlx
- gemma
- code
- reasoning
- agent
- finetuned
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

gemma-4-12b-opus-finetuned-mlx

Fine-tuned mlx-community/gemma-4-12b-it-bf16 on Claude Opus 4.6/4.7 reasoning data with R1-distilled code examples.

Variants

cyborgxx101/gemma-4-12b-opus-finetuned-mlx (16bit, 22GB)
cyborgxx101/gemma-4-12b-opus-finetuned-mlx-8bit (~12GB)
cyborgxx101/gemma-4-12b-opus-finetuned-mlx-4bit (~7GB)
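The variant sizes follow roughly from parameter count times bits per weight. The sketch below is a back-of-the-envelope estimate, not the author's measurement. It assumes 12 billion parameters (read off the "12b" in the model name) and, for the quantized variants, a group size of 64 with a 16-bit scale and a 16-bit bias stored per group. `estimate_weight_gb` is a hypothetical helper.

```python
from typing import Optional


def estimate_weight_gb(n_params: float, bits: int,
                       group_size: Optional[int] = None) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    bits_per_weight = float(bits)
    if group_size is not None:
        # Assumption: each quantization group carries a 16-bit scale
        # and a 16-bit bias, i.e. 32 extra bits shared by the group.
        bits_per_weight += 32 / group_size
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 12e9  # assumed from the "12b" in the model name

for label, bits, group in [("16bit", 16, None), ("8bit", 8, 64), ("4bit", 4, 64)]:
    print(f"{label}: ~{estimate_weight_gb(N_PARAMS, bits, group):.2f} GB")
```

Under these assumptions the estimates come out at 24 GB, 12.75 GB, and 6.75 GB, in the same range as the listed sizes. The remaining gap is plausibly down to layers left unquantized and to the true parameter count differing from a round 12 billion.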

Inference

With oMLX (recommended for Apple Silicon)

```bash
omlx serve
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "cyborgxx101/gemma-4-12b-opus-finetuned-mlx-8bit", "messages": [{"role": "user", "content": "Analyze the database schema and construct an optimized multi-join query."}]}'
```
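The same request can be sent from Python. This is a minimal standard-library sketch, assuming the server on port 8000 returns responses in the OpenAI chat-completions shape (`choices[0].message.content`). `build_request`, `extract_reply`, and `ask` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # host and port from the curl example
MODEL = "cyborgxx101/gemma-4-12b-opus-finetuned-mlx-8bit"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a POST request for the /chat/completions endpoint."""
    body = {"model": model,
            "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response: dict) -> str:
    """Pull the assistant text out of an OpenAI-style response (assumed shape)."""
    return response["choices"][0]["message"]["content"]


def ask(prompt: str) -> str:
    """Send one prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return extract_reply(json.load(resp))


# Usage (needs the server from `omlx serve` to be running):
# print(ask("Analyze the database schema and construct an optimized multi-join query."))
```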