Experimental model produced by running an additional round of DPO training on top of Kunoichi-DPO-v2-7b, i.e. double-DPO.

230 11 months ago

4ddf52c3a1ac · 146B
{
  "stop": [
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>",
    "<|reserved_special_token"
  ],
  "temperature": 0.6
}
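
To illustrate what the "stop" entries above are for, here is a minimal sketch in Python. It assumes that a runtime cuts generated text at the earliest occurrence of any stop string; the helper name `truncate_at_stop` and the sample text are hypothetical and are not taken from any particular inference library. Because matching is by substring, the last entry, `<|reserved_special_token`, acts as a prefix that catches every numbered reserved token.

```python
import json

# Parameters copied from the JSON block above.
PARAMS = json.loads("""
{
  "stop": [
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>",
    "<|reserved_special_token"
  ],
  "temperature": 0.6
}
""")


def truncate_at_stop(text, stops):
    """Return text cut at the earliest occurrence of any stop string.

    Hypothetical helper: an assumed model of how stop sequences behave,
    not the implementation used by any specific runtime.
    """
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


if __name__ == "__main__":
    # Made-up raw output containing a stop token after the reply.
    raw = "Hello there!<|eot_id|><|start_header_id|>user"
    print(truncate_at_stop(raw, PARAMS["stop"]))
```

The "temperature" value is not used in this sketch; it only affects sampling inside the model runtime.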