547 5 months ago

vision tools thinking
ollama run ucx0204/glm-4.6V-Flash-Q8

Applications

Claude Code
Claude Code ollama launch claude --model ucx0204/glm-4.6V-Flash-Q8
Codex App
Codex App ollama launch codex-app --model ucx0204/glm-4.6V-Flash-Q8
OpenClaw
OpenClaw ollama launch openclaw --model ucx0204/glm-4.6V-Flash-Q8
Hermes Agent
Hermes Agent ollama launch hermes --model ucx0204/glm-4.6V-Flash-Q8
Codex
Codex ollama launch codex --model ucx0204/glm-4.6V-Flash-Q8
OpenCode
OpenCode ollama launch opencode --model ucx0204/glm-4.6V-Flash-Q8

Models

View all →

Readme

GLM-4.6V-Flash (Q8_0 GGUF)

This is a GGUF version of the GLM-4.6V-Flash model, quantized to Q8_0 (8-bit) for high-quality inference. It originates from Zhipu AI and was converted/quantized by Unsloth.

🚀 Model Details

  • Original Model: Zhipu AI GLM-4.6V-Flash
  • Quantization: Q8_0 (8-bit) - High quality, balanced memory usage.
  • Format: GGUF (Compatible with Ollama)
  • Capabilities: Multimodal (Vision & Text), Flash attention for speed.
  • Source: unsloth/GLM-4.6V-Flash-GGUF