160 downloads · Updated 3 days ago

Fully uncensored local AI for coding, automation, vision tasks, and direct final answers, built to reduce unnecessary thinking output and deliver complete responses.

Capabilities: vision · tools · thinking
ollama run studiobrn/uncensoredmodai

Details

f12ec0a1f258 · 18GB · gemma4 · 25.8B · Q4_K_M

System prompt (truncated): You are a direct coding assistant. Never output thinking, reasoning traces, analysis, chain-of-thoug…

Parameters (truncated): { "num_ctx": 4096, "num_predict": 2048, "repeat_penalty": 1.05, "stop": [ "<…

Readme

uncensoredmodAI

uncensoredmodAI is a direct-response Ollama model profile designed for coding, technical assistance, automation workflows, and practical local AI usage.

This model is optimized for users who want clear final answers, complete code output, and less unnecessary explanation.

Model Details

  • Architecture: Gemma 4
  • Parameters: 25.8B
  • Quantization: Q4_K_M
  • Context length: 262K (supported by the base model)
  • Capabilities: Text, Vision, Tools, Thinking
  • Recommended mode: --think=false for direct answers
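
A profile like this is typically assembled from a Modelfile on top of a base model. A minimal sketch (the `FROM` target, exact system prompt wording, and parameter set here are illustrative assumptions, not the published Modelfile):

```
# Illustrative Modelfile; substitute the actual base model tag
FROM gemma3:27b

SYSTEM """You are a direct coding assistant. Never output thinking, reasoning traces, analysis, or chain-of-thought. Provide only the complete final answer."""

PARAMETER num_ctx 4096
PARAMETER num_predict 2048
PARAMETER repeat_penalty 1.05
```

Building from such a file is then `ollama create mymodel -f Modelfile`.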

Run

ollama run studiobrn/uncensoredmodai

For direct final answers without visible thinking output:

ollama run studiobrn/uncensoredmodai --think=false

Recommended Local Settings

For smoother usage on lower-memory machines, especially Apple Silicon systems with 16GB of RAM:

OLLAMA_CONTEXT_LENGTH=4096 \
OLLAMA_NUM_PARALLEL=1 \
OLLAMA_MAX_LOADED_MODELS=1 \
OLLAMA_FLASH_ATTENTION=1 \
OLLAMA_KV_CACHE_TYPE=q4_0 \
OLLAMA_KEEP_ALIVE=1m \
ollama serve
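
If Ollama runs as the macOS menu-bar app instead of `ollama serve`, shell exports will not reach it; environment variables for GUI apps can be set with `launchctl setenv` (a sketch mirroring the values above):

```shell
# Persist server settings for the Ollama.app process on macOS
launchctl setenv OLLAMA_CONTEXT_LENGTH 4096
launchctl setenv OLLAMA_FLASH_ATTENTION 1
launchctl setenv OLLAMA_KV_CACHE_TYPE q4_0
# Restart the Ollama app afterwards so it picks up the new values
```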

Then run the model in another terminal:

ollama run studiobrn/uncensoredmodai --think=false

Best Prompt Style for Coding

Use direct prompts like:

Do not output thinking.
Do not output reasoning traces.
Only provide the final answer.
Write complete code.
Do not stop in the middle of a function, class, file, JSON, or command.

Example:

Write a complete Node.js Express API in one file. Only return the code.
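
The same style works non-interactively: `ollama run` accepts the prompt as a trailing argument, so a direct request can be scripted (a sketch; output goes to stdout):

```shell
ollama run studiobrn/uncensoredmodai --think=false \
  "Write a complete Node.js Express API in one file. Only return the code."
```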

Optimized For

  • Coding assistance
  • Technical Q&A
  • Local AI experimentation
  • Automation workflows
  • Tool-based assistant usage
  • Direct final-answer responses
  • Vision-supported tasks
  • Long-form structured answers
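
Since the model advertises tool support, tool definitions can be passed through the standard Ollama chat API. A sketch (the `get_weather` function is hypothetical):

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "studiobrn/uncensoredmodai",
  "messages": [{ "role": "user", "content": "What is the weather in Paris?" }],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }
  }],
  "think": false,
  "stream": false
}'
```

When the model decides to call the tool, the response carries a `tool_calls` entry instead of plain text.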

Notes

This is a large 25.8B model. On 16GB Apple Silicon machines, it may use partial CPU/GPU offloading, so response speed can vary.

For best results:

  • Use --think=false
  • Keep context controlled
  • Increase num_predict when longer code output is needed
  • Use smaller context settings for faster responses
  • Use larger context settings only when working with long files or detailed coding tasks
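
Inside an interactive `ollama run` session, the options above can be adjusted per session without editing the model, using the standard Ollama REPL commands:

```
>>> /set parameter num_ctx 8192
>>> /set parameter num_predict 4096
>>> /set parameter temperature 0.1
```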

Example API Usage

curl http://localhost:11434/api/chat -d '{
  "model": "studiobrn/uncensoredmodai",
  "messages": [
    {
      "role": "user",
      "content": "Write a clean Python FastAPI example. Only return the code."
    }
  ],
  "think": false,
  "stream": true,
  "options": {
    "num_ctx": 4096,
    "num_predict": 2048,
    "temperature": 0.1
  }
}'
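
The same request can be issued from Python with only the standard library. A minimal sketch; the endpoint and field names follow the curl example above:

```python
import json
import urllib.request


def build_chat_request(prompt: str) -> dict:
    """Assemble the /api/chat payload used in the curl example above."""
    return {
        "model": "studiobrn/uncensoredmodai",
        "messages": [{"role": "user", "content": prompt}],
        "think": False,   # suppress visible thinking output
        "stream": False,  # return one complete JSON response
        "options": {"num_ctx": 4096, "num_predict": 2048, "temperature": 0.1},
    }


def chat(prompt: str, host: str = "http://localhost:11434") -> str:
    """POST the payload to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        host + "/api/chat",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


if __name__ == "__main__":
    print(chat("Write a clean Python FastAPI example. Only return the code."))
```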

Purpose

uncensoredmodAI is built for direct, practical, uncensored, and productive local AI workflows.

The main goal is simple:

Fully uncensored local AI. Less unnecessary thinking output. More complete final answers. Better assistance.