starling-lm

927.7K downloads · Updated 2 years ago

Starling is a large language model trained with reinforcement learning from AI feedback (RLAIF) to improve chatbot helpfulness.

ollama run starling-lm

# Send one chat message to a local Ollama server (default port 11434)
curl http://localhost:11434/api/chat \
  -d '{
    "model": "starling-lm",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Requires the ollama Python package and a running Ollama server
from ollama import chat

response = chat(
    model='starling-lm',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

// Requires the ollama npm package and a running Ollama server
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'starling-lm',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
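
The curl request above can also be sent from Python without installing a client package. The following is a minimal sketch, assuming Ollama is serving on the default local address shown in the curl example. The helper names build_chat_request and chat_once are illustrative and not part of any Ollama library. The request sets "stream" to false so that the server replies with a single JSON object rather than a stream of chunks.

```python
import json
import urllib.request

# Default local endpoint, as used in the curl example above.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt, model="starling-lm"):
    """Build (but do not send) a POST request for the /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Ask for one complete JSON response instead of streamed chunks.
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat_once(prompt, model="starling-lm"):
    """Send a single user message and return the assistant's reply text.

    Needs a running Ollama server with the model already pulled.
    """
    with urllib.request.urlopen(build_chat_request(prompt, model)) as resp:
        body = json.load(resp)
    return body["message"]["content"]
```

With the server running, calling chat_once("Hello!") returns the reply as a string. Splitting request construction from sending keeps the payload easy to inspect without a network call.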

Models

36 models

Name                Size   Context  Input  Updated      Notes
starling-lm:latest  4.1GB  8K       Text   2 years ago
starling-lm:7b      4.1GB  8K       Text   2 years ago  latest

Readme

Starling-7B is an open (non-commercial) large language model (LLM) trained by reinforcement learning from AI feedback (RLAIF).

The model is trained on our new GPT-4-labeled ranking dataset, Nectar, using our new reward training and policy tuning pipeline. Starling-7B-alpha scores 8.09 on MT-Bench with GPT-4 as a judge, outperforming every model to date on MT-Bench except for OpenAI’s GPT-4 and GPT-4 Turbo.

Note: these results are based on MT-Bench evaluations scored by GPT-4. Further human evaluation is needed.

Authors: Banghua Zhu, Evan Frick, Tianhao Wu, Hanlin Zhu and Jiantao Jiao.

For correspondence, please contact Banghua Zhu (banghua@berkeley.edu).

Reference

Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF

HuggingFace