
This is Unsloth's DeepSeek-R1. I just merged it and uploaded it here. This is the full 671B model.

MoE Bits: 2.22bit
Type: UD-IQ2_XXS
Disk Size: 183GB
Accuracy: Better
Details: MoE all 2.06bit; down_proj in MoE is a mixture of 2.5/2.06bit
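As a rough sanity check on the listed disk size, the parameter count and bit width can be turned into a size estimate. This is only a sketch: it assumes the 2.22bit figure applies uniformly to all 671B weights, which is not quite what the listing says (the layers are quantized at mixed widths), and `estimated_size_gb` is a hypothetical helper name.

```python
def estimated_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size in decimal gigabytes for a uniformly quantized model.

    Assumption: every weight is stored at the same average bit width and
    file metadata is ignored.
    """
    total_bits = n_params * bits_per_weight
    total_bytes = total_bits / 8
    return total_bytes / 1e9


# 671B parameters at the listed 2.22 bits per weight.
estimate = estimated_size_gb(671e9, 2.22)
print(f"{estimate:.1f} GB")
```

The estimate comes out at roughly 186 GB, which is in the same range as the listed 183GB; the gap is expected given the mixed bit widths.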

dc3a163fe619 · 161B
{
  "num_gpu": 12,
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ]
}
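To illustrate what the `stop` entries are for, here is a minimal sketch that parses the parameters above and cuts generated text at the first stop string. It assumes plain substring matching on already-decoded text; `truncate_at_stop` is a hypothetical helper, not part of any runtime, and real runtimes usually do this check while streaming tokens.

```python
import json

# The parameter file from above, embedded as a string for a self-contained example.
PARAMS_JSON = """
{
  "num_gpu": 12,
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ]
}
"""


def truncate_at_stop(text: str, stops: list[str]) -> str:
    """Return text up to (not including) the earliest stop string, if any."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


params = json.loads(PARAMS_JSON)

# Hypothetical raw model output that runs past the end of the answer.
raw = "The answer is 4.<|end▁of▁sentence|><|User|>next question"
print(truncate_at_stop(raw, params["stop"]))
```

Without these stop strings, a chat model can keep generating past its own turn and start writing the next user message itself.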