guzesqdro/Claude_Sonnet_4.6

guzesqdro/Claude_Sonnet_4.6_Reduced

1,984 Downloads · Updated 3 months ago

Claude Sonnet 4.6 is a balanced AI model for reasoning, coding, and general tasks. It offers strong instruction following, clear responses, and good performance in both creative and technical use cases. This is an unofficial version; see the Readme for more info.

tools

ollama run guzesqdro/Claude_Sonnet_4.6_Reduced

curl http://localhost:11434/api/chat \
  -d '{
    "model": "guzesqdro/Claude_Sonnet_4.6_Reduced",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
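Ollama's `/api/chat` endpoint streams its reply as newline-delimited JSON unless the request sets `"stream": false`, so the curl call above prints a series of partial objects rather than one message. The following is a minimal sketch of reassembling those chunks into a single string with only the Python standard library. The helper names `collect_stream` and `chat_once` are illustrative and not part of any Ollama package, and the sketch assumes a local server on the default port shown in the curl example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # same local endpoint as the curl example
MODEL = "guzesqdro/Claude_Sonnet_4.6_Reduced"


def collect_stream(lines):
    """Join the message.content pieces from an iterable of NDJSON lines."""
    parts = []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        chunk = json.loads(raw)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final object carries "done": true
    return "".join(parts)


def chat_once(prompt, url=OLLAMA_URL, model=MODEL):
    """Send one user message and return the full streamed reply.

    Requires a running Ollama server; hypothetical helper for illustration.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # An HTTP response object can be iterated line by line (as bytes).
    with urllib.request.urlopen(req) as resp:
        return collect_stream(resp)
```

With a server running and the model pulled, `print(chat_once("Hello!"))` would print the assembled reply. Keeping the parsing in `collect_stream` separate from the network call makes it possible to check the chunk handling without a server.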

from ollama import chat

response = chat(
    model='guzesqdro/Claude_Sonnet_4.6_Reduced',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'guzesqdro/Claude_Sonnet_4.6_Reduced',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Applications

Claude Code: ollama launch claude --model guzesqdro/Claude_Sonnet_4.6_Reduced

Codex App: ollama launch codex-app --model guzesqdro/Claude_Sonnet_4.6_Reduced

OpenClaw: ollama launch openclaw --model guzesqdro/Claude_Sonnet_4.6_Reduced

Hermes Agent: ollama launch hermes --model guzesqdro/Claude_Sonnet_4.6_Reduced

Codex: ollama launch codex --model guzesqdro/Claude_Sonnet_4.6_Reduced

OpenCode: ollama launch opencode --model guzesqdro/Claude_Sonnet_4.6_Reduced

Models

1 model

Name · Size · Context · Input · Updated

Claude_Sonnet_4.6_Reduced:latest · 986MB · 32K context window · Text · 3 months ago

Readme

This is a lightweight Claude Sonnet-style assistant built to run quickly on older or low-end devices. It prioritizes speed, low memory use, and fast response generation while aiming to keep natural language output quality high.

The model is built on a Qwen 2.5-based architecture, fine-tuned and adapted to deliver concise, coherent, and context-aware responses. It has been optimized for lower latency and computational load, which makes it suitable for environments where full-scale large language models are too resource-intensive.
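For the resource-constrained devices described above, Ollama's chat API accepts an `options` object in the request body; its `num_ctx` field sets the context length to allocate, and a smaller value generally lowers memory use. The following is a hedged sketch of building such a request body. The helper `build_chat_payload`, the default of 4096 tokens, and the reading of "32K" as 32,768 tokens are assumptions for illustration, not settings recommended by the model author.

```python
import json

MODEL = "guzesqdro/Claude_Sonnet_4.6_Reduced"
MAX_CONTEXT = 32 * 1024  # assumption: the listed "32K" window means 32,768 tokens


def build_chat_payload(prompt, num_ctx=4096, stream=False):
    """Return a JSON body for POST /api/chat with a capped context length.

    Hypothetical helper; the default num_ctx is an illustrative value.
    """
    if not 1 <= num_ctx <= MAX_CONTEXT:
        raise ValueError(f"num_ctx must be between 1 and {MAX_CONTEXT}")
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,                 # False asks for one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # tune per device; smaller uses less memory
    }
    return json.dumps(payload)
```

The returned string can be passed as the `-d` argument of the curl command or as the body of any HTTP client request. Whether a smaller context noticeably helps on a given device is something to measure rather than assume.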

Rather than replicating any specific proprietary model, this version is inspired by modern conversational AI assistants and focuses on delivering a similar user experience in terms of clarity, reasoning ability, and helpfulness, while remaining lightweight and efficient.

Made with ❤️ by guzesqdro 🥳