3 1 year ago

A fine-tuned Llama 3.1 8B model with supplementary DPO training

tools