160 downloads · Updated 3 days ago

Fully uncensored local AI for coding, automation, vision tasks, and direct final answers, built to reduce unnecessary thinking output and deliver complete responses.

Capabilities: vision · tools · thinking
ollama run studiobrn/uncensoredmodai

Details

f12ec0a1f258 · 18GB · gemma4 · 25.8B · Q4_K_M

System prompt (truncated): You are a direct coding assistant. Never output thinking, reasoning traces, analysis, chain-of-thoug…

Parameters (truncated): { "num_ctx": 4096, "num_predict": 2048, "repeat_penalty": 1.05, "stop": [ "<…

Readme

uncensoredmodAI

uncensoredmodAI is a direct-response Ollama model profile designed for coding, technical assistance, automation workflows, and practical local AI usage.

This model is optimized for users who want clear final answers, complete code output, and less unnecessary explanation.

Model Details

  • Architecture: Gemma 4
  • Parameters: 25.8B
  • Quantization: Q4_K_M
  • Context length: 262K (supported by the base model)
  • Capabilities: Text, Vision, Tools, Thinking
  • Recommended mode: --think=false for direct answers
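
A profile like this is typically assembled from a Modelfile on top of a base model. A minimal sketch (the `FROM` target, exact system prompt wording, and parameter set here are illustrative assumptions, not the published Modelfile):

```
# Illustrative Modelfile; substitute the actual base model tag
FROM gemma3:27b

SYSTEM """You are a direct coding assistant. Never output thinking, reasoning traces, analysis, or chain-of-thought. Provide only the complete final answer."""

PARAMETER num_ctx 4096
PARAMETER num_predict 2048
PARAMETER repeat_penalty 1.05
```

Building from such a file is then `ollama create mymodel -f Modelfile`.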

Run

ollama run studiobrn/uncensoredmodai

For direct final answers without visible thinking output:

ollama run studiobrn/uncensoredmodai --think=false

Recommended Local Settings

For smoother usage on lower-memory machines, especially Apple Silicon systems with 16GB of RAM:

OLLAMA_CONTEXT_LENGTH=4096 \
OLLAMA_NUM_PARALLEL=1 \
OLLAMA_MAX_LOADED_MODELS=1 \
OLLAMA_FLASH_ATTENTION=1 \
OLLAMA_KV_CACHE_TYPE=q4_0 \
OLLAMA_KEEP_ALIVE=1m \
ollama serve
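
If Ollama runs as the macOS menu-bar app instead of `ollama serve`, shell exports will not reach it; environment variables for GUI apps can be set with `launchctl setenv` (a sketch mirroring the values above):

```shell
# Persist server settings for the Ollama.app process on macOS
launchctl setenv OLLAMA_CONTEXT_LENGTH 4096
launchctl setenv OLLAMA_FLASH_ATTENTION 1
launchctl setenv OLLAMA_KV_CACHE_TYPE q4_0
# Restart the Ollama app afterwards so it picks up the new values
```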

Then run the model in another terminal:

ollama run studiobrn/uncensoredmodai --think=false

Best Prompt Style for Coding

Use direct prompts like:

Do not output thinking.
Do not output reasoning traces.
Only provide the final answer.
Write complete code.
Do not stop in the middle of a function, class, file, JSON, or command.

Example:

Write a complete Node.js Express API in one file. Only return the code.
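
The same style works non-interactively: `ollama run` accepts the prompt as a trailing argument, so a direct request can be scripted (a sketch; output goes to stdout):

```shell
ollama run studiobrn/uncensoredmodai --think=false \
  "Write a complete Node.js Express API in one file. Only return the code."
```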

Optimized For

  • Coding assistance
  • Technical Q&A
  • Local AI experimentation
  • Automation workflows
  • Tool-based assistant usage
  • Direct final-answer responses
  • Vision-supported tasks
  • Long-form structured answers
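
Since the model advertises tool support, tool definitions can be passed through the standard Ollama chat API. A sketch (the `get_weather` function is hypothetical):

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "studiobrn/uncensoredmodai",
  "messages": [{ "role": "user", "content": "What is the weather in Paris?" }],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }
  }],
  "think": false,
  "stream": false
}'
```

When the model decides to call the tool, the response carries a `tool_calls` entry instead of plain text.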

Notes

This is a large 25.8B model. On 16GB Apple Silicon machines, it may use partial CPU/GPU offloading, so response speed can vary.

For best results:

  • Use --think=false
  • Keep context controlled
  • Increase num_predict when longer code output is needed
  • Use smaller context settings for faster responses
  • Use larger context settings only when working with long files or detailed coding tasks
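
Inside an interactive `ollama run` session, the options above can be adjusted per session without editing the model, using the standard Ollama REPL commands:

```
>>> /set parameter num_ctx 8192
>>> /set parameter num_predict 4096
>>> /set parameter temperature 0.1
```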

Example API Usage

curl http://localhost:11434/api/chat -d '{
  "model": "studiobrn/uncensoredmodai",
  "messages": [
    {
      "role": "user",
      "content": "Write a clean Python FastAPI example. Only return the code."
    }
  ],
  "think": false,
  "stream": true,
  "options": {
    "num_ctx": 4096,
    "num_predict": 2048,
    "temperature": 0.1
  }
}'
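
The same request can be issued from Python with only the standard library. A minimal sketch; the endpoint and field names follow the curl example above:

```python
import json
import urllib.request


def build_chat_request(prompt: str) -> dict:
    """Assemble the /api/chat payload used in the curl example above."""
    return {
        "model": "studiobrn/uncensoredmodai",
        "messages": [{"role": "user", "content": prompt}],
        "think": False,   # suppress visible thinking output
        "stream": False,  # return one complete JSON response
        "options": {"num_ctx": 4096, "num_predict": 2048, "temperature": 0.1},
    }


def chat(prompt: str, host: str = "http://localhost:11434") -> str:
    """POST the payload to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        host + "/api/chat",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


if __name__ == "__main__":
    print(chat("Write a clean Python FastAPI example. Only return the code."))
```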

Purpose

uncensoredmodAI is built for direct, practical, uncensored, and productive local AI workflows.

The main goal is simple:

Fully uncensored local AI. Less unnecessary thinking output. More complete final answers. Better assistance.