DPO-finetuned model from FusionNet_7Bx2_MoE_14B


LHK_DPO_v1 was trained via Direct Preference Optimization (DPO) from https://huggingface.co/TomGrc/FusionNet_7Bx2_MoE_14B.
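
For readers unfamiliar with the technique, below is a minimal, hypothetical sketch of DPO finetuning using Hugging Face's TRL library. The actual training script, preference dataset, and hyperparameters used for LHK_DPO_v1 are not published on this page, so the dataset choice, `beta` value, and all other settings here are illustrative assumptions, not the model's real recipe.

```python
# Minimal DPO finetuning sketch with Hugging Face TRL (assumes trl >= 0.12).
# Everything below is illustrative: the real LHK_DPO_v1 recipe is unpublished.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "TomGrc/FusionNet_7Bx2_MoE_14B"  # base model named in this card
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Placeholder preference data with "prompt"/"chosen"/"rejected" columns;
# the dataset actually used to train LHK_DPO_v1 is not stated here.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

# beta controls how far the policy may drift from the reference model;
# 0.1 is a common default, not necessarily the value used for this model.
config = DPOConfig(
    output_dir="lhk_dpo_sketch",
    beta=0.1,
    per_device_train_batch_size=1,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older trl releases
)
trainer.train()
```

DPO optimizes directly on preference pairs, sidestepping the separate reward-model and RL stages used in classic RLHF pipelines.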

The original model is available at https://huggingface.co/HanNayeoniee/LHK_DPO_v1.