63 2 months ago

Reduced the s5 default context to 4096 tokens to allow better local inference. It sometimes thinks in Chinese??

9744aa0ee273 · 202B
{
  "num_ctx": 4096,
  "num_gpu": 1,
  "num_thread": 8,
  "repeat_penalty": 1.2,
  "stop": [
    "<|user|>",
    "<|assistant|>",
    "<|system|>",
    "<|observation|>"
  ],
  "temperature": 0.5,
  "top_p": 0.8
}
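
The parameter names above look like Ollama-style runtime options, so here is a minimal sketch of how they might be sent per request, assuming an Ollama-compatible server on the default local port. The model name `s5`, the helper names `build_payload` and `generate`, and the host URL are placeholders for illustration, not something this page confirms.

```python
import json
import urllib.request

# Options copied from the parameter block above.
OPTIONS = {
    "num_ctx": 4096,
    "num_gpu": 1,
    "num_thread": 8,
    "repeat_penalty": 1.2,
    "stop": ["<|user|>", "<|assistant|>", "<|system|>", "<|observation|>"],
    "temperature": 0.5,
    "top_p": 0.8,
}


def build_payload(model: str, prompt: str, **overrides) -> dict:
    """Build a request body; keyword overrides replace individual options."""
    options = {**OPTIONS, **overrides}
    return {"model": model, "prompt": prompt, "stream": False, "options": options}


def generate(model: str, prompt: str,
             host: str = "http://localhost:11434", **overrides) -> str:
    """POST to an Ollama-style /api/generate endpoint (assumed to be running)."""
    body = json.dumps(build_payload(model, prompt, **overrides)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Inspect the body without contacting a server; "s5" is a placeholder name.
payload = build_payload("s5", "Hello", num_ctx=2048)
print(json.dumps(payload["options"], indent=2))
```

Passing an override such as `num_ctx=2048` changes only that request, so the stored 4096-token default stays untouched while experimenting with smaller or larger contexts.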