Starling is a large language model trained with reinforcement learning from AI feedback (RLAIF), focused on improving chatbot helpfulness.
Readme
Starling-7B is an open (non-commercial) large language model (LLM) trained by reinforcement learning from AI feedback (RLAIF).
The model harnesses the power of our new GPT-4-labeled ranking dataset, Nectar, and our new reward-training and policy-tuning pipeline. Starling-7B-alpha scores 8.09 on MT-Bench with GPT-4 as a judge, outperforming every model to date on MT-Bench except OpenAI's GPT-4 and GPT-4 Turbo.
*Based on MT-Bench evaluations using GPT-4 scoring. Further human evaluation is needed.
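As a minimal sketch of how the model could be queried once pulled locally with Ollama, the Python snippet below calls Ollama's local REST API. The model tag "starling-lm" and the example prompt are assumptions; substitute whatever tag you actually pulled.

```python
import requests

# Assumes a local Ollama server on the default port and that the model
# has been pulled under the assumed tag "starling-lm".
OLLAMA_URL = "http://localhost:11434/api/generate"

payload = {
    "model": "starling-lm",                       # assumed model tag; adjust if different
    "prompt": "Explain RLAIF in two sentences.",  # example prompt
    "stream": False,                              # return one JSON object instead of a stream
}

resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["response"])
```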
Authors: Banghua Zhu, Evan Frick, Tianhao Wu, Hanlin Zhu and Jiantao Jiao.
For correspondence, please contact Banghua Zhu (banghua@berkeley.edu).
Reference
Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF