SFR-Iterative-DPO-LLaMA-3-8B-R is a LLaMA-3-8B model further fine-tuned with SFT and RLHF, and it performs well. The model is from the Salesforce team.
154 Pulls • Updated 6 months ago
1 Tag
df04f1eec67a • 4.7GB • 6 months ago