latest · 4.7GB
SFR-Iterative-DPO-LLaMA-3-8B-R is a LLaMA-3-8B model further fine-tuned with SFT and iterative RLHF, offering strong performance for its size. The model is from the Salesforce team.
8B
135 Pulls Updated 4 months ago
df04f1eec67a · 4.7GB
model
arch llama · parameters 8.03B · quantization Q4_0 · 4.7GB
template
{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>
255B
params
{"num_keep":24,"stop":["<|start_header_id|>","<|end_header_id|>","<|eot_id|>"]}
110B
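These params travel with the model, but the same options can also be set per request. A minimal sketch of a request body for Ollama's `/api/generate` endpoint (the tag name `sfr-llama3` is a placeholder, not the model's real tag):

```python
import json

payload = {
    "model": "sfr-llama3",  # placeholder tag for this model
    "prompt": "Why is the sky blue?",
    "options": {
        # Tokens from the start of the prompt to keep on context shift.
        "num_keep": 24,
        # Stop generation at Llama 3's special tokens.
        "stop": ["<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"],
    },
}
body = json.dumps(payload)
# POST `body` to http://localhost:11434/api/generate to run the model.
print(body)
```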
system
You're a very useful AI assistant.
34B
Readme
SFR-Iterative-DPO-LLaMA-3-8B-R is a LLaMA-3-8B model further fine-tuned with SFT and iterative RLHF, offering strong performance for its size. The model is from the Salesforce team.