
Unsloth's DeepSeek-R1. I just merged it and uploaded it here. This is the full 671B model.

MoE Bits: 2.51-bit
Type: UD-Q2_K_XL
Disk Size: 212GB
Accuracy: Best
Details: MoE all 2.5-bit; down_proj in MoE is a mixture of 3.5/2.5-bit
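As a rough sanity check on the numbers above, the average bits per weight can be estimated from the listed disk size and parameter count. This is only a sketch: it assumes "212GB" means decimal gigabytes and that the file holds nothing but weights. The result is an average over every tensor, so it need not match the 2.51-bit figure, which describes the MoE layers.

```python
def bits_per_weight(disk_bytes: float, n_params: float) -> float:
    """Average storage cost per parameter, in bits."""
    return disk_bytes * 8 / n_params


# Assumption: 212GB is decimal (212e9 bytes); 671B parameters as listed.
avg = bits_per_weight(212e9, 671e9)
print(f"{avg:.2f} bits per weight")  # prints "2.53 bits per weight"
```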

bdf941804a53 · 161B
{
  "num_gpu": 11,
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ]
}
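These parameters are defaults baked into the model. In Ollama, num_gpu is the number of layers offloaded to the GPU, so 11 may be too many or too few for a given card. The sketch below shows one way to override it per request by building a body for Ollama's /api/generate endpoint. The model tag and the build_generate_request helper are hypothetical names used only for illustration, and nothing is sent over the network.

```python
import json

# Parameters copied from the params blob above.
PARAMS = {
    "num_gpu": 11,
    "stop": [
        "<|begin▁of▁sentence|>",
        "<|end▁of▁sentence|>",
        "<|User|>",
        "<|Assistant|>",
    ],
}


def build_generate_request(model, prompt, num_gpu=None):
    """Build a JSON body for POST /api/generate (hypothetical helper)."""
    options = dict(PARAMS)  # shallow copy so the defaults stay untouched
    if num_gpu is not None:
        options["num_gpu"] = num_gpu  # override the number of offloaded layers
    return {"model": model, "prompt": prompt, "stream": False, "options": options}


if __name__ == "__main__":
    # "deepseek-r1-671b-example" is a placeholder tag, not the real model name.
    body = build_generate_request("deepseek-r1-671b-example", "Hello", num_gpu=8)
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

The resulting body could then be posted to a local Ollama server, by default at http://localhost:11434/api/generate. Lowering num_gpu trades speed for lower VRAM use.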