141 3 months ago

A 4B-parameter, 8-bit quantized inference model with /think (reasoning) and /no_think (fast response) modes.

tools thinking
cff3f395ef37 · 120B
{
"repeat_penalty": 1,
"stop": [
"<|im_start|>",
"<|im_end|>"
],
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95
}