44 1 month ago

This is not the ablated version. DeepSeek-V3.1 is a hybrid model that supports both thinking and non-thinking modes.

tools thinking 671b
077ea6923937 · 191B
{
  "num_gpu": 1,
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ],
  "temperature": 0.6,
  "top_p": 0.95
}
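
As a rough illustration of how these parameters might be supplied per request instead of relying on the stored defaults, the sketch below builds a chat payload for a locally running Ollama server. It assumes the server listens on the default `http://localhost:11434` and exposes `/api/chat` with an `options` field. The helper names `build_chat_request` and `send`, and the model name used in the usage note, are hypothetical and not part of this page.

```python
import json
import urllib.request

# Parameters copied from the params blob above.
PARAMS = {
    "num_gpu": 1,
    "stop": [
        "<|begin▁of▁sentence|>",
        "<|end▁of▁sentence|>",
        "<|User|>",
        "<|Assistant|>",
    ],
    "temperature": 0.6,
    "top_p": 0.95,
}


def build_chat_request(model, prompt, overrides=None):
    """Return a chat payload; `overrides` replaces individual options."""
    options = {**PARAMS, **(overrides or {})}
    return {
        "model": model,  # assumption: whatever name the model was pulled under
        "messages": [{"role": "user", "content": prompt}],
        "options": options,
        "stream": False,  # ask for a single JSON response, not a stream
    }


def send(payload, host="http://localhost:11434"):
    """POST the payload to a local server (assumed host) and return the reply text."""
    request = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

For example, `send(build_chat_request("my-deepseek-v3.1", "Hello", {"temperature": 0.2}))` would lower the temperature for one call while keeping the stop sequences and `top_p` listed above. The model name here is a placeholder.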