A Korean fine-tuned model based on NousResearch/Meta-Llama-3.1-8B-Instruct, trained with SFT -> RLHF -> DPO.

tools 8b 70b

686 3 months ago

033465c927a9 · 110B
{
  "num_keep": 10,
  "stop": [
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>"
  ]
}
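
The parameters above can be read as follows: the three `stop` strings are the Llama 3 chat-template special tokens, so generation ends when the model starts a new turn header or emits the end-of-turn marker, and `num_keep` is understood here as the number of leading prompt tokens retained when the context is truncated. The sketch below is only an illustration of the stop-sequence behavior, not the runtime's actual implementation. The helper name `truncate_at_stop` and the sample string are hypothetical.

```python
import json

# The params blob from this page, embedded so the sketch is self-contained.
PARAMS_JSON = """
{
  "num_keep": 10,
  "stop": ["<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"]
}
"""


def truncate_at_stop(text, stop_sequences):
    """Return text cut at the earliest occurrence of any stop sequence.

    Hypothetical helper: it mimics the visible effect of stop sequences
    on an already generated string.
    """
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


params = json.loads(PARAMS_JSON)

# Made-up raw output that runs past the end of the assistant turn.
raw = "안녕하세요!<|eot_id|><|start_header_id|>user<|end_header_id|>"
clean = truncate_at_stop(raw, params["stop"])
print(clean)
```

Cutting at the earliest match matters because the raw text may contain several stop strings; only the content before the first one belongs to the current assistant turn.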