OxW/Qwen3-0.6B-GGUF

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.

Details

Updated 1 year ago

1 year ago

3e52e1a23a40 · 639MB ·

model

archqwen3

parameters596M

quantizationQ8_0

639MB

license

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US

12kB

params

{ "min_p": 0, "num_ctx": 40960, "num_predict": 32768, "presence_penalty": 1.5, "

270B

template

{{- if .Messages }} {{- if or .System .Tools }}<|im_start|>system {{- if .System }} {{ .System }} {{

1.5kB

Source: HF Q8_0

Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

Uniquely support of seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within single model, ensuring optimal performance across various scenarios.
Significantly enhancement in its reasoning capabilities, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience.
Expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks.
Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation.

Model Overview

Qwen3-0.6B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 0.6B - Number of Paramaters (Non-Embedding): 0.44B - Number of Layers: 28 - Number of Attention Heads (GQA): 16 for Q and 8 for KV

Context Length: 32,768.
Quantization: q8_0

For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our blog, GitHub, and Documentation.

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.

Details

Readme

Model Overview