I’ve uploaded additional quantizations of deepseek-r1-0528-qwen3:8b, all pulled from Unsloth’s DynamicQuant2.0 page on Hugging Face. These quantizations are intended to offer better accuracy while remaining efficient. The default is Q4_K_M, and Q2_K_XL is also available.
DeepSeek-R1-0528-Qwen3-8B is a significant upgrade to the DeepSeek R1 model series, built on the Qwen3 architecture. This version (0528) delivers stronger reasoning and inference capabilities through algorithmic optimization and increased computational resources during post-training. The model performs strongly across a range of benchmark evaluations, with results approaching those of leading models such as o3 and Gemini 2.5 Pro.
The DeepSeek-R1-0528 upgrade brings several notable enhancements over previous versions:
For benchmarks requiring sampling, the model uses:

- Temperature: 0.6
- Top-p value: 0.95
- 16 responses per query to estimate pass@1
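
As a rough illustration of the evaluation settings above, here is a minimal Python sketch of how pass@1 could be estimated from 16 sampled responses per query. The text does not say which estimator was used; this assumes the commonly used unbiased pass@k formula, which for k = 1 reduces to the fraction of correct samples. The names `SAMPLING`, `pass_at_k`, and `mean_pass_at_1` are hypothetical and not part of any library.

```python
from math import comb

# Sampling settings quoted in the text. The dict keys are illustrative
# (an assumption), not the option names of a specific inference API.
SAMPLING = {"temperature": 0.6, "top_p": 0.95, "num_samples": 16}


def pass_at_k(n: int, c: int, k: int = 1) -> float:
    """Unbiased pass@k estimate from n samples, c of which are correct.

    Computes 1 - C(n - c, k) / C(n, k); for k = 1 this equals c / n.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Every size-k subset of the samples contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_1(results: list[list[bool]]) -> float:
    """Average pass@1 over queries.

    Each inner list holds the per-sample correctness flags for one query.
    """
    return sum(pass_at_k(len(r), sum(r), 1) for r in results) / len(results)


if __name__ == "__main__":
    n = SAMPLING["num_samples"]
    # Made-up correctness flags for two queries, purely for demonstration.
    demo = [[True] * 8 + [False] * 8, [True] * n]
    print(mean_pass_at_1(demo))
```

Averaging over several samples per query reduces the variance that a single sampled response would have at a non-zero temperature.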
This model is ideal for applications requiring advanced reasoning capabilities, complex problem-solving, and code generation. Its improved performance in mathematical reasoning makes it particularly suitable for educational and research applications.