I’m excited to share a new agentic fine-tune of Gemma-4-12B designed specifically for tool-calling and raw reasoning loops on Apple Silicon.

tools thinking

ollama run cyborgxx101/gemma-4-12b-opus-finetuned-mlx:8bit

curl http://localhost:11434/api/chat \
  -d '{
    "model": "cyborgxx101/gemma-4-12b-opus-finetuned-mlx:8bit",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='cyborgxx101/gemma-4-12b-opus-finetuned-mlx:8bit',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'cyborgxx101/gemma-4-12b-opus-finetuned-mlx:8bit',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
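Since the model is tagged for tools, here is a minimal sketch of what a tool-calling request body could look like. It assumes Ollama's `/api/chat` endpoint with its `tools` field; the `get_weather` tool and the helper name `build_tool_request` are hypothetical and exist only for illustration. Whether this particular fine-tune returns well-formed tool calls is not something the page confirms.

```python
import json

MODEL = "cyborgxx101/gemma-4-12b-opus-finetuned-mlx:8bit"

# Hypothetical tool definition, invented for this example only.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(prompt, tools):
    """Build a non-streaming /api/chat body that offers the given tools."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
        "stream": False,
    }


body = build_tool_request("What is the weather in Oslo?", [WEATHER_TOOL])
# Print the JSON that would be POSTed to http://localhost:11434/api/chat.
print(json.dumps(body, indent=2))
```

The printed JSON can be sent with the same `curl` call shown above. If the model chooses to call a tool, Ollama reports the calls under `message.tool_calls` in the reply instead of plain `message.content`.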

Details

Updated 1 month ago

ab9ed3fe14b6 · 13GB ·

model

arch gemma4

parameters 11.9B

quantization Q8_0

13GB

template

<|start|>turn|user|{{ .Prompt }}<|channel|>assistant\n{{ .Response }}<|end|>

76B
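To make the template concrete, the sketch below fills in its two placeholders with plain string substitution. This is only an approximation: Ollama renders templates with Go's template engine, and the `\n` is copied verbatim from the page as a backslash followed by `n`, without assuming how Ollama interprets it. The function name `render` is hypothetical.

```python
import re

# Template text as displayed on the page, kept verbatim in a raw string.
TEMPLATE = r"<|start|>turn|user|{{ .Prompt }}<|channel|>assistant\n{{ .Response }}<|end|>"


def render(prompt, response=""):
    """Substitute {{ .Prompt }} and {{ .Response }} in a single pass."""
    values = {"Prompt": prompt, "Response": response}
    # A single re.sub pass avoids re-expanding placeholders that appear in the inputs.
    return re.sub(r"\{\{ \.(\w+) \}\}", lambda m: values[m.group(1)], TEMPLATE)


print(render("Hello!", "Hi there."))
```

Rendering a prompt this way is mainly useful for checking which control tokens surround the user text and where the `<|end|>` stop sequence lands.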

system

<|think|>\nYou are a helpful AI assistant. For every request, first think step by step, then provide

218B

params

{ "num_ctx": 8192, "stop": [ "<|end|>", "Task complete." ], "tempera

81B
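The parameters above are defaults baked into the model, and Ollama's chat API accepts per-request overrides through an `options` object. The sketch below builds such a request body. It uses only the two values that are visible in the truncated params blob (`num_ctx` and `stop`); the cut-off entry is deliberately left out, and the helper name `with_options` is hypothetical.

```python
import json

MODEL = "cyborgxx101/gemma-4-12b-opus-finetuned-mlx:8bit"

# Only the values visible on the page; the truncated entry is not guessed.
VISIBLE_PARAMS = {"num_ctx": 8192, "stop": ["<|end|>", "Task complete."]}


def with_options(messages, **overrides):
    """Build an /api/chat body whose options start from the visible defaults."""
    options = {**VISIBLE_PARAMS, **overrides}
    return {
        "model": MODEL,
        "messages": messages,
        "options": options,
        "stream": False,
    }


# Example: halve the context window for a short exchange.
body = with_options([{"role": "user", "content": "Hello!"}], num_ctx=4096)
print(json.dumps(body, indent=2))
```

Repeating the defaults in the request is redundant, but it keeps the effective settings visible next to the override.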

Readme

license: gemma
base_model: mlx-community/gemma-4-12b-it-bf16
tags:
- mlx
- gemma
- code
- reasoning
- agent
- finetuned
language:
- en
pipeline_tag: text-generation
library_name: transformers

gemma-4-12b-opus-finetuned-mlx

Fine-tuned mlx-community/gemma-4-12b-it-bf16 on Claude Opus 4.6/4.7 reasoning data with R1-distilled code examples.

Variants

- cyborgxx101/gemma-4-12b-opus-finetuned-mlx (16bit, 22GB)
- cyborgxx101/gemma-4-12b-opus-finetuned-mlx-8bit (~12GB)
- cyborgxx101/gemma-4-12b-opus-finetuned-mlx-4bit (~7GB)

Inference

With oMLX (recommended for Apple Silicon)

```bash
# Start the server, then send the request from a second terminal.
omlx serve
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "cyborgxx101/gemma-4-12b-opus-finetuned-mlx-8bit", "messages": [{"role": "user", "content": "Analyze the database schema and construct an optimized multi-join query."}]}'