DPO fine-tuned model based on FusionNet_7Bx2_MoE_14B
13B
23 Pulls · Updated 7 months ago
2aef458b7638 · 9.1GB
model · arch llama · parameters 12.9B · quantization Q5_K_M · 9.1GB
params · {"stop":["[INST]","[/INST]"]} · 30B
template · [INST] {{ .System }} {{ .Prompt }} [/INST] · 42B
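
The template and stop parameters above can be illustrated with a small sketch. It is not how Ollama itself applies the template (that happens inside the server); it only mimics the substitution and the stop-sequence cut so the prompt format is easier to see. The function names `render_prompt` and `truncate_at_stop` are hypothetical and exist only for this illustration.

```python
# Illustrative sketch only: mimics the template and stop parameters listed on
# this page. The helper names are hypothetical, not part of any Ollama API.

STOP_SEQUENCES = ["[INST]", "[/INST]"]  # copied from the params entry


def render_prompt(system: str, prompt: str) -> str:
    """Fill in the template: [INST] {{ .System }} {{ .Prompt }} [/INST]"""
    return f"[INST] {system} {prompt} [/INST]"


def truncate_at_stop(text: str, stops=STOP_SEQUENCES) -> str:
    """Cut generated text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


if __name__ == "__main__":
    print(render_prompt("You are a helpful assistant.", "What is DPO?"))
    print(truncate_at_stop("DPO trains on preference pairs.[INST] next turn"))
```

With an empty system message the rendered prompt keeps both surrounding spaces, which matches what a literal substitution into the template would produce.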
Readme
LHK_DPO_v1 is trained via Direct Preference Optimization (DPO) from https://huggingface.co/TomGrc/FusionNet_7Bx2_MoE_14B.
The original model is available at https://huggingface.co/HanNayeoniee/LHK_DPO_v1