wangrongsheng/sfr-iterative-dpo-llama-3-8b-r

228 Downloads · Updated 1 year ago

SFR-Iterative-DPO-LLaMA-3-8B-R is a LLaMA-3-8B model further fine-tuned with SFT and RLHF, which gives it good performance. The model comes from the Salesforce team.

ollama run wangrongsheng/sfr-iterative-dpo-llama-3-8b-r

# Send a single chat message to the local Ollama server's chat endpoint
curl http://localhost:11434/api/chat \
  -d '{
    "model": "wangrongsheng/sfr-iterative-dpo-llama-3-8b-r",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

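# Requires the ollama Python package (pip install ollama)
from ollama import chat

# Send one user message and print the model's reply
response = chat(
    model='wangrongsheng/sfr-iterative-dpo-llama-3-8b-r',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)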
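
If the ollama package is not installed, the same request as the curl example can be sketched with only the Python standard library. This is a minimal sketch, not part of the original page: the helper names build_chat_payload and chat_once are hypothetical, the server is assumed to be running at the default local address used above, and "stream" is set to false so that the server returns a single JSON object rather than a stream of chunks.

```python
import json
import urllib.request

# Assumption: default local Ollama address, as in the curl example above.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "wangrongsheng/sfr-iterative-dpo-llama-3-8b-r"


def build_chat_payload(prompt, model=MODEL):
    """Build the JSON body for a single-turn, non-streaming chat request.

    Hypothetical helper, introduced here for illustration only.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # request one JSON object instead of streamed chunks
    }


def chat_once(prompt, url=OLLAMA_CHAT_URL, timeout=120):
    """POST the payload to the chat endpoint and return the reply text.

    Hypothetical helper; it needs a running Ollama server to succeed.
    """
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        reply = json.load(resp)
    # The non-streaming response carries the assistant message under "message".
    return reply["message"]["content"]
```

With the server running and the model pulled, print(chat_once("Hello!")) should print the model's reply. Building the payload in a separate function keeps the request body easy to inspect without any network access.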

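// Requires the ollama npm package; top-level await needs an ES module
import ollama from 'ollama'

// Send one user message and print the model's reply
const response = await ollama.chat({
  model: 'wangrongsheng/sfr-iterative-dpo-llama-3-8b-r',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)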

Models (1 model)

Name                                   Size   Context  Input  Updated
sfr-iterative-dpo-llama-3-8b-r:latest  4.7GB  8K       Text   1 year ago
