
Uses a 4096-token context window so flash attention works as intended. Also trying a new template and system prompt to see how the model reacts.

c648f7acaee5 · 185B
{
  "num_ctx": 4096,
  "repeat_penalty": 1.15,
  "seed": 42,
  "stop": [
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>"
  ],
  "temperature": 0.1,
  "top_k": 50,
  "top_p": 0.95
}
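
The same parameters can be baked into an Ollama Modelfile so they apply every time the model runs. A minimal sketch; the base model name is a placeholder (the stop tokens suggest a Llama 3-style template, but the source does not name the base):

```
# Hypothetical base model -- replace with the actual model this card is built on
FROM llama3

# Fixed 4096-token context so flash attention behaves as intended
PARAMETER num_ctx 4096

# Low temperature and a fixed seed for near-deterministic, repeatable output
PARAMETER temperature 0.1
PARAMETER seed 42

# Sampling and repetition controls from the params layer above
PARAMETER repeat_penalty 1.15
PARAMETER top_k 50
PARAMETER top_p 0.95

# Llama 3-style header/end-of-turn stop tokens
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>
```

Build it with `ollama create <name> -f Modelfile`; each `PARAMETER` line corresponds one-to-one with a key in the JSON params layer, with `stop` repeated once per stop token.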