Ollama
glm-4.6 · Ollama
Search for models on Ollama.
  • sparksammy/glm-4.6v-flash-unsloth

    vision tools thinking

    633 Pulls · 5 Tags · Updated 4 months ago

  • ucx0204/glm-4.6V-Flash-Q8

    vision tools thinking

    547 Pulls · 1 Tag · Updated 5 months ago

  • doitmagic/glm-4.6v-flash

    508 Pulls · 1 Tag · Updated 6 months ago

  • zerocopia/glm-4.6

    tools thinking cloud

    3 Pulls · 1 Tag · Updated 1 month ago

  • haervwe/GLM-4.6V-Flash-9B

    GLM 4.6V Flash 9B model with vision, tools, and hybrid thinking enabled. Uses a custom template to align it with Ollama and applies the recommended sampling settings by default. Built from Unsloth quants at Q4_K_M.

    vision tools thinking

    3,798 Pulls · 1 Tag · Updated 5 months ago

  • ShreyanGondaliya/s5

    An uncensored model based on GLM-4.6v-flash:9b (q5_k_m). For local use, I recommend editing the context length in the Modelfile, as it is set to 128k. Edit: a new locally optimised model, identical except for a 4096 context, is available at https://ollama.com/ShreyanGondaliya/s5-reduced

    415 Pulls · 1 Tag · Updated 4 months ago

  • MedAIBase/GLM-4.6V-Flash

    GLM-4.6V-Flash (9B) is a lightweight model optimized for local deployment and low-latency applications. It scales its context window to 128k tokens in training and achieves SoTA performance in visual understanding among models of similar parameter scales.

    9b

    366 Pulls · 3 Tags · Updated 4 months ago

  • rnogy/GLM-4.6V

    unsloth/GLM-4.6V 106b

    151 Pulls · 1 Tag · Updated 5 months ago

  • alibilge/Huihui-GLM-4.6V-Flash-abliterated

    Abliterated (Uncensored) GLM4.6 Flash

    7,155 Pulls · 11 Tags · Updated 6 months ago

  • MichelRosselli/GLM-4.6

    GLM-4.6 is a hybrid reasoning model that provides two modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for immediate responses.

    tools thinking

    5,701 Pulls · 9 Tags · Updated 6 months ago

  • gurubot/GLM-4.6V-Flash-GGUF

    tools thinking

    1,114 Pulls · 1 Tag · Updated 6 months ago

  • aiasistentworld/GLM-4.6-LLM

    New version GLM-4.6

    430 Pulls · 1 Tag · Updated 8 months ago

  • scorpion7slayer/GLM-4.6V-Flash

    Model imported from Hugging Face.

    239 Pulls · 1 Tag · Updated 6 months ago

  • MichelRosselli/GLM-4.6-REAP-218B-A32B-FP8-mixed-AutoRound

    This model is a mixed gguf q2ks format of Cerebras' GLM-4.6-REAP-218B-A32B-FP8 generated using Intel's AutoRound algorithm.

    tools thinking

    159 Pulls · 1 Tag · Updated 7 months ago

  • MichelRosselli/GLM-4.6-REAP-268B-A32B

    GLM-4.6-REAP-268B-A32B (by Cerebras), a memory-efficient compressed variant of GLM-4.6 that maintains near-identical performance while being 25% lighter.

    tools thinking

    131 Pulls · 9 Tags · Updated 6 months ago

  • JollyLlama/GLM-4-32B-0414-Q4_K_M

    This model requires Ollama v0.6.6 or later

    5,816 Pulls · 1 Tag · Updated 1 year ago

  • rhundt/GLM-4-0414-32b-128k-Q4_K_M

    GLM-4-0414 32B with 128k context (YaRN RoPE scaling). Requires Ollama v0.6.6.

    tools

    1,077 Pulls · 1 Tag · Updated 1 year ago

  • coney_/gpt-oss_claude-sonnet4.6

    gpt-oss_claude-sonnet4.6 is the GPT-OSS model running on the Claude Sonnet 4.6 system prompt, combining GPT-OSS's open-source foundation with Claude Sonnet 4.6's advanced instructions and behavior.

    tools thinking

    7,576 Pulls · 1 Tag · Updated 3 months ago

  • coney_/gemma3_claude-sonnet4.6

    gemma3_claude-sonnet4.6 is the Gemma 3 model running on the Claude Sonnet 4.6 system prompt, combining Gemma 3's open-source foundation with Claude Sonnet 4.6's instructions and behavior.

    vision

    915 Pulls · 1 Tag · Updated 3 months ago

  • ShreyanGondaliya/gemma-4-claude-opus-4.6-thinking-s7-multimodal

    Gemma 4 distilled from Claude Opus 4.6 thinking. Has only a 5% gap with Claude Opus 4.6 thinking while being over 40x smaller. Designed for both server and local inference.

    tools thinking

    560 Pulls · 1 Tag · Updated 2 months ago

© 2026 Ollama