The most powerful open-source coding AI - 480B parameters with Mixture of Experts architecture for exceptional code generation and understanding.

Qwen3-Coder-480B: The Most Powerful Open-Source Coding AI

πŸš€ Overview

Qwen3-Coder-480B is a massive 480-billion-parameter Mixture-of-Experts (MoE) model, designed for advanced code generation, code understanding, and software development tasks. With 160 experts, of which 8 are active per token (35B active parameters), it delivers state-of-the-art coding capability while keeping per-token compute close to that of a much smaller dense model.
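The 35B-active / 480B-total ratio comes from the router selecting only 8 of the 160 experts for each token. A minimal, self-contained sketch of top-k expert routing (the scorer and experts here are toy stand-ins, not the model's actual code):

```python
# Toy sketch of top-k Mixture-of-Experts routing (illustrative only:
# the real router is a learned layer and each expert is a full FFN block).
import random

calls = []  # records which experts actually ran

def make_expert(i):
    def expert(x):
        calls.append(i)     # track activation
        return (i + 1) * x  # stand-in for an expert FFN
    return expert

def moe_forward(token, experts, router, k=8):
    """Score all experts, run only the top-k, and mix their outputs
    weighted by normalized router scores."""
    ranked = sorted(range(len(experts)), key=lambda i: router(token, i), reverse=True)
    top = ranked[:k]
    weights = [router(token, i) for i in top]
    total = sum(weights)
    return sum(w / total * experts[i](token) for w, i in zip(weights, top))

random.seed(0)
scores = {i: random.random() for i in range(160)}  # fixed fake router scores
router = lambda token, i: scores[i]
experts = [make_expert(i) for i in range(160)]

out = moe_forward(1.0, experts, router, k=8)
print(f"experts run: {len(calls)} of {len(experts)}")  # 8 of 160
```

Only the 8 selected experts execute; the other 152 stay idle for that token, which is why inference cost tracks the active rather than total parameter count.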

🎯 Key Features

  • 480B total parameters with 35B active (MoE architecture)
  • 262K context length (expandable to 1M with YaRN)
  • 100+ programming languages supported
  • State-of-the-art performance on coding benchmarks
  • Multiple quantizations from 163GB to 368GB

πŸ“Š Benchmark Results

  • HumanEval: 89.3% pass@1
  • MBPP: 78.2% pass@1
  • CodeContests: 42.7% success rate
  • MultiPL-E: Leading scores across 18 languages

🏷️ Available Versions

Tag      Size   RAM Required  Description
q2-k     163GB  ~170GB        Smallest, fastest inference
q3-k-s   193GB  ~200GB        Good balance for testing
q4-k-m   271GB  ~280GB        Recommended - best quality/size ratio
q5-k-m   318GB  ~330GB        High quality for critical tasks
q6-k     368GB  ~380GB        Maximum quality preservation
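For planning, the table above can also be read programmatically. A small hypothetical helper (sizes copied from the table; this is not an official sizing tool) that picks the highest-quality tag fitting a given RAM budget:

```python
# (tag, download size in GB, approximate RAM needed in GB), per the table above
QUANTS = [
    ("q2-k",   163, 170),
    ("q3-k-s", 193, 200),
    ("q4-k-m", 271, 280),
    ("q5-k-m", 318, 330),
    ("q6-k",   368, 380),
]

def pick_quant(ram_gb):
    """Return the highest-quality tag whose RAM estimate fits, or None."""
    fitting = [tag for tag, _size, ram in QUANTS if ram <= ram_gb]
    return fitting[-1] if fitting else None

print(pick_quant(512))  # q6-k
print(pick_quant(256))  # q3-k-s
```

The list is ordered smallest to largest, so the last fitting entry is the best quality that still fits.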

πŸ’» Quick Start

# Recommended version (Q4_K_M)
ollama run richardyoung/qwen3-coder:q4-k-m "Write a Python web server with async support"

# Smallest version for testing (Q2_K)
ollama run richardyoung/qwen3-coder:q2-k "Explain the quicksort algorithm"

# High quality version (Q6_K)
ollama run richardyoung/qwen3-coder:q6-k "Implement a red-black tree in Rust"

πŸ› οΈ Example Use Cases

Code Generation

ollama run richardyoung/qwen3-coder:q4-k-m "Create a React component for infinite scrolling with virtualization"

Code Review

ollama run richardyoung/qwen3-coder:q4-k-m "Review this code for security vulnerabilities: [paste your code]"

Algorithm Implementation

ollama run richardyoung/qwen3-coder:q4-k-m "Implement Dijkstra's algorithm in Python with detailed comments"

Code Translation

ollama run richardyoung/qwen3-coder:q4-k-m "Convert this JavaScript function to Rust: [paste function]"

πŸ”§ Advanced Configuration

Custom Parameters

# ollama run has no sampling flags; set parameters inside the session:
ollama run richardyoung/qwen3-coder:q4-k-m
>>> /set parameter temperature 0.7
>>> /set parameter top_p 0.9
>>> /set parameter top_k 20
>>> /set parameter repeat_penalty 1.05
>>> Your prompt here
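The temperature, top_p, and top_k settings correspond to standard sampling filters. A simplified sketch of how top-k and then top-p (nucleus) filtering trim the candidate distribution before a token is drawn (not Ollama's actual implementation):

```python
def filter_probs(probs, top_k=20, top_p=0.9):
    """Keep the top_k most likely tokens, then cut to the smallest prefix
    whose cumulative probability reaches top_p; renormalize the rest."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(filter_probs(probs, top_k=3, top_p=0.9))  # keeps a, b, c
```

Lower top_p or top_k makes output more deterministic; temperature (not shown) rescales the logits before this filtering step.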

Extended Context

# For larger codebases (e.g. 32K tokens), raise the context window in
# the session; the CLI has no --num-ctx flag
ollama run richardyoung/qwen3-coder:q4-k-m
>>> /set parameter num_ctx 32768
>>> Analyze this codebase and suggest improvements
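Raising `num_ctx` grows the KV cache linearly with context length. A back-of-the-envelope estimate of that growth (the layer count, head count, and head dimension below are illustrative placeholders, not the model's published config):

```python
def kv_cache_gb(num_ctx, layers=62, kv_heads=8, head_dim=128, bytes_per=2):
    """KV cache ~= 2 (K and V) * layers * kv_heads * head_dim * ctx * bytes.
    All architecture numbers here are placeholders for illustration."""
    return 2 * layers * kv_heads * head_dim * num_ctx * bytes_per / 1e9

print(round(kv_cache_gb(8192), 2), "GB at 8K context")
print(round(kv_cache_gb(32768), 2), "GB at 32K context")
```

Whatever the real per-layer dimensions are, quadrupling the context quadruples the cache, so budget extra RAM or VRAM before raising `num_ctx`.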

πŸ“‹ System Requirements

Minimum Requirements

  • RAM: 256GB (for Q2_K with partial GPU offload)
  • GPU: 2x RTX 4090 or equivalent
  • Storage: 400GB free space

Recommended Setup

  • RAM: 512GB
  • GPU: 4x A100 80GB or 8x RTX 4090
  • Storage: 1TB NVMe SSD

🌟 What Makes This Model Special

  1. Specialized Training: 5.5T tokens of high-quality code and technical content
  2. MoE Efficiency: Only 35B active parameters despite 480B total size
  3. Language Coverage: Exceptional performance across 100+ programming languages
  4. Context Understanding: Native 262K context for large codebases
  5. Production Ready: Extensively tested on real-world coding tasks

🀝 Community & Support

  • Model Card: Based on Qwen/Qwen3-Coder-480B-A35B-Instruct
  • Quantization: Created using llama.cpp
  • Issues: Report via Ollama community forums
  • Updates: Follow for new quantizations and improvements

πŸ“ License

This model follows the Qwen model license. Please refer to the original model repository for detailed licensing information.

πŸ™ Acknowledgments

  • Qwen Team for creating this exceptional model
  • llama.cpp community for quantization tools
  • Ollama for making large models accessible

Note: Due to the model's size, downloads may take considerable time. Ensure a stable internet connection and sufficient free storage before pulling.