177 1 month ago

Adds a system prompt to DeepSeek's new 8B model based on Qwen 3, which may help output quality. The context window is kept large (num_ctx 131072) and the temperature low (0.1) for more deterministic responses.

tools thinking
a905be7905c0 · 217B
{
  "num_ctx": 131072,
  "seed": 42,
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ],
  "temperature": 0.1,
  "top_k": 50,
  "top_p": 0.95
}
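
To reproduce these settings on another base model, the params above can be rewritten as Modelfile `PARAMETER` lines. The following is a minimal sketch, assuming the Ollama Modelfile convention of one `PARAMETER <name> <value>` line per setting, with list values such as `stop` repeated once per entry. The helper name `to_modelfile_lines` is hypothetical and not part of any library.

```python
import json

# Parameters copied from the params layer shown above.
PARAMS_JSON = """
{
  "num_ctx": 131072,
  "seed": 42,
  "stop": [
    "<|begin▁of▁sentence|>",
    "<|end▁of▁sentence|>",
    "<|User|>",
    "<|Assistant|>"
  ],
  "temperature": 0.1,
  "top_k": 50,
  "top_p": 0.95
}
"""


def to_modelfile_lines(params: dict) -> list[str]:
    """Render a params dict as Modelfile PARAMETER lines (hypothetical helper)."""
    lines = []
    for key, value in params.items():
        # List-valued parameters such as "stop" become one line per entry.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, str):
                # Quote strings so the special tokens are passed through intact.
                rendered = json.dumps(item, ensure_ascii=False)
            else:
                rendered = str(item)
            lines.append(f"PARAMETER {key} {rendered}")
    return lines


if __name__ == "__main__":
    print("\n".join(to_modelfile_lines(json.loads(PARAMS_JSON))))
```

The printed lines can be pasted under a `FROM` line in a Modelfile. The fixed `seed` and low `temperature` make repeated runs more reproducible, at the cost of less varied output.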