DPO-finetuned model from FusionNet_7Bx2_MoE_14B

LHK_DPO_v1 is trained via Direct Preference Optimization (DPO) from https://huggingface.co/TomGrc/FusionNet_7Bx2_MoE_14B; a sketch of what such a training setup could look like is given below.
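
A minimal sketch of DPO finetuning with the `trl` library, assuming a recent trl version. The preference dataset name, hyperparameters, and output directory are illustrative assumptions, not the authors' actual training configuration.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "TomGrc/FusionNet_7Bx2_MoE_14B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Hypothetical preference dataset with "prompt", "chosen", "rejected" columns.
train_dataset = load_dataset("your/preference-dataset", split="train")

config = DPOConfig(
    output_dir="lhk-dpo-v1",        # assumed output directory
    beta=0.1,                       # assumed KL-penalty strength
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
)

trainer = DPOTrainer(
    model=model,                    # the policy being finetuned
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # `tokenizer=` in older trl versions
)
trainer.train()
```

When `ref_model` is not passed, DPOTrainer keeps a frozen copy of the initial policy as the reference model for the DPO loss.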

The original model is available at https://huggingface.co/HanNayeoniee/LHK_DPO_v1; a minimal inference example follows.
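
A minimal sketch of loading the released checkpoint for inference with `transformers`; the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HanNayeoniee/LHK_DPO_v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # requires the `accelerate` package
)

prompt = "What does DPO finetuning change about a model's behavior?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```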