Korean fine-tuned model based on NousResearch/Meta-Llama-3.1-8B-Instruct, trained with SFT->RLHF->DPO

tools 8b 70b