QwQ is an experimental research model focused on advancing AI reasoning capabilities.

QwQ is a 32B parameter experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities.

QwQ demonstrates remarkable performance across these benchmarks:

  • 65.2% on GPQA, showcasing its graduate-level scientific reasoning capabilities
  • 50.0% on AIME, highlighting its strong mathematical problem-solving skills
  • 90.6% on MATH-500, demonstrating exceptional mathematical comprehension across diverse topics
  • 50.0% on LiveCodeBench, validating its robust programming abilities in real-world scenarios

These results underscore QwQ’s significant advancement in analytical and problem-solving capabilities, particularly in technical domains requiring deep reasoning.

As a preview release, QwQ shows promising analytical abilities but has several important limitations:

  1. Language Mixing and Code-Switching: The model may mix languages or switch between them unexpectedly, affecting response clarity.

  2. Recursive Reasoning Loops: The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer.

  3. Safety and Ethical Considerations: The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it.

  4. Performance and Benchmark Limitations: The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.
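
Because of limitation 2, callers may want to check a response for heavily repeated passages before trusting it. The snippet below is a minimal sketch of one possible heuristic, not part of QwQ or of any official tooling. The function name `has_repeated_passage` and its thresholds (`window`, `max_repeats`) are hypothetical and would need tuning on real outputs.

```python
from collections import Counter


def has_repeated_passage(text: str, window: int = 8, max_repeats: int = 3) -> bool:
    """Hypothetical heuristic: flag text in which any run of `window`
    consecutive words occurs at least `max_repeats` times.

    Frequent exact repetition can be a sign of a circular reasoning loop,
    but it is only a rough signal; paraphrased loops will not be caught.
    """
    words = text.lower().split()
    if len(words) < window:
        return False
    # Count every sliding window of `window` words across the response.
    counts = Counter(
        tuple(words[i : i + window]) for i in range(len(words) - window + 1)
    )
    return max(counts.values()) >= max_repeats
```

A caller could, for example, stop generation or retry with a shorter output limit when the check returns `True`. Matching exact word windows is cheap and needs no extra dependencies, at the cost of missing loops that reword themselves.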