Ollama
glm · Ollama Search
Search for models on Ollama.
  • glm4

    A strong multilingual general language model with performance competitive with Llama 3.

    9b

    190.8K Pulls · 32 Tags · Updated 1 year ago

  • glm-4.6

    Advanced agentic, reasoning and coding capabilities.

    cloud

    34.1K Pulls · 1 Tag · Updated 2 months ago

  • gemma

    Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1.

    2b 7b

    5.6M Pulls · 102 Tags · Updated 1 year ago

  • granite3.1-moe

    The IBM Granite 1B and 3B models are long-context mixture of experts (MoE) Granite models from IBM designed for low latency usage.

    tools 1b 3b

    1.5M Pulls · 33 Tags · Updated 11 months ago

  • granite3.3

    IBM Granite 2B and 8B models are 128K context length language models that have been fine-tuned for improved reasoning and instruction-following capabilities.

    tools 2b 8b

    773K Pulls · 3 Tags · Updated 8 months ago

  • granite3.2-vision

    A compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

    vision tools 2b

    559.8K Pulls · 5 Tags · Updated 9 months ago

  • granite3.2

    Granite-3.2 is a family of long-context AI models from IBM Granite fine-tuned for thinking capabilities.

    tools 2b 8b

    188.6K Pulls · 9 Tags · Updated 9 months ago

  • granite3.1-dense

    The IBM Granite 2B and 8B models are text-only dense LLMs trained on over 12 trillion tokens of data, demonstrating significant improvements over their predecessors in performance and speed in IBM’s initial testing.

    tools 2b 8b

    150.4K Pulls · 33 Tags · Updated 11 months ago

  • goliath

    A language model created by combining two fine-tuned Llama 2 70B models into one.

    55.5K Pulls · 16 Tags · Updated 2 years ago

  • gemma3

    The current, most capable model that runs on a single GPU.

    vision cloud 270m 1b 4b 12b 27b

    28.4M Pulls · 29 Tags · Updated 1 week ago

  • qwen3

    Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.

    tools thinking 0.6b 1.7b 4b 8b 14b 30b 32b 235b

    15.2M Pulls · 58 Tags · Updated 2 months ago

  • gemma2

    Google Gemma 2 is a high-performing and efficient model available in three sizes: 2B, 9B, and 27B.

    2b 9b 27b

    11.5M Pulls · 94 Tags · Updated 1 year ago

  • codegemma

    CodeGemma is a collection of powerful, lightweight models that can perform a variety of coding tasks like fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following.

    2b 7b

    1.7M Pulls · 85 Tags · Updated 1 year ago

  • starcoder2

    StarCoder2 is the next generation of transparently trained open code LLMs that come in three sizes: 3B, 7B, and 15B parameters.

    3b 7b 15b

    1.6M Pulls · 67 Tags · Updated 1 year ago

  • granite-code

    A family of open foundation models by IBM for Code Intelligence.

    3b 8b 20b 34b

    393.4K Pulls · 162 Tags · Updated 1 year ago

  • granite4

    Granite 4 models feature improved instruction following (IF) and tool-calling capabilities, making them more effective in enterprise applications.

    tools 350m 1b 3b

    295.4K Pulls · 17 Tags · Updated 1 month ago

  • granite-embedding

    The IBM Granite Embedding 30M and 278M models are text-only dense biencoder embedding models, with 30M available in English only and 278M serving multilingual use cases.

    embedding 30m 278m

    143.7K Pulls · 6 Tags · Updated 1 year ago

  • llama-guard3

    Llama Guard 3 is a series of models fine-tuned for content safety classification of LLM inputs and responses.

    1b 8b

    119.6K Pulls · 33 Tags · Updated 1 year ago

  • granite3-moe

    The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM designed for low latency usage.

    tools 1b 3b

    114K Pulls · 33 Tags · Updated 1 year ago

  • gemini-3-pro-preview

    Google's most intelligent model with SOTA reasoning and multimodal understanding, and powerful agentic and vibe coding capabilities.

    cloud

    37.9K Pulls · 1 Tag · Updated 1 month ago

© 2025 Ollama