
A DeepSeek-R1:8b model that uses DeepSeek-R1:1.5b as its draft model.

thinking
ed8474dc73db · 179B
{
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ],
  "temperature": 0.6,
  "top_p": 0.95
}
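
The parameters above can also be passed explicitly per request. The following is a minimal sketch, assuming a local Ollama server on its default port and the `/api/generate` endpoint; the helper names `build_payload` and `generate`, and whatever model tag you pass in, are placeholders for illustration and are not taken from this page.

```python
import json
import urllib.request

# Sampling parameters copied from the params block above.
PARAMS = {
    "stop": [
        "<|begin▁of▁sentence|>",
        "<|end▁of▁sentence|>",
        "<|User|>",
        "<|Assistant|>",
    ],
    "temperature": 0.6,
    "top_p": 0.95,
}

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request body that carries the params as options."""
    return {
        "model": model,  # hypothetical: use the tag you pulled the model under
        "prompt": prompt,
        "stream": False,
        "options": dict(PARAMS),
    }


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because these values are already stored with the model, sending them again is only needed when you want to override them for a single request, for example a lower `temperature` for more deterministic output.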