mannix/eurus-2-7b-prime:IQ3_XXS

152 Downloads Updated 1 year ago

Eurus-2-7B-PRIME is trained using the PRIME (Process Reinforcement through IMplicit rEward) method, an open-source solution for online reinforcement learning (RL) with process rewards, to advance the reasoning abilities of language models.

ollama run mannix/eurus-2-7b-prime:IQ3_XXS

curl http://localhost:11434/api/chat \
  -d '{
    "model": "mannix/eurus-2-7b-prime:IQ3_XXS",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='mannix/eurus-2-7b-prime:IQ3_XXS',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'mannix/eurus-2-7b-prime:IQ3_XXS',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details


096bf7f628e0 · 3.1GB

model

arch qwen2 · parameters 7.62B · quantization IQ3_XXS

3.1GB

system

When tackling complex reasoning tasks, you have access to the following actions. Use them as needed

384B

license

Qwen RESEARCH LICENSE AGREEMENT Qwen RESEARCH LICENSE AGREEMENT Release Date: September 19, 2024 By

7.4kB

template

{{- range $i, $_ := .Messages }} {{- $last := eq (len (slice $.Messages $i)) 1 -}} <|im_start|>{{ .R

255B

Readme

Quantized from fp32, using an importance matrix (i-matrix) computed from calibration_datav3.txt.
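As a rough sanity check on the quantization level, the file size and parameter count listed on this page can be turned into an average number of bits per weight. The sketch below assumes that "3.1GB" means 3.1 × 10^9 bytes and that "7.62B" means 7.62 × 10^9 parameters; both figures are rounded on the page, and `bits_per_weight` is a hypothetical helper name, so the result is only approximate.

```python
def bits_per_weight(size_bytes: float, n_params: float) -> float:
    """Average storage cost per parameter, in bits (hypothetical helper)."""
    return size_bytes * 8 / n_params


# Values taken from this page; the unit interpretation is an assumption.
size_bytes = 3.1e9   # "3.1GB", read as decimal gigabytes
n_params = 7.62e9    # "7.62B" parameters

print(f"{bits_per_weight(size_bytes, n_params):.2f} bits per weight")
```

The average covers the whole file, so it can differ from the nominal rate of the quantization type: file metadata and any tensors stored at a different precision are included in the total.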

Eurus-2-7B-PRIME is trained using the PRIME (Process Reinforcement through IMplicit rEward) method, an open-source solution for online reinforcement learning (RL) with process rewards, to advance the reasoning abilities of language models beyond imitation or distillation. It starts from Eurus-2-7B-SFT and is trained on Eurus-2-RL-Data.

System Prompt

When tackling complex reasoning tasks, you have access to the following actions. Use them as needed to progress through your thought process.

[ASSESS]

[ADVANCE]

[VERIFY]

[SIMPLIFY]

[SYNTHESIZE]

[PIVOT]

[OUTPUT]

You should strictly follow the format below:

[ACTION NAME]

# Your action step 1

# Your action step 2

# Your action step 3

...

Next action: [NEXT ACTION NAME]
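If the model follows the format above, its replies can be split into individual action blocks for inspection or logging. The following is a minimal sketch, not part of the model or the Ollama client: `parse_actions` is a hypothetical helper, the `sample` text is hand-written for illustration rather than real model output, and a heavily quantized model may not always follow the format strictly.

```python
import re

# Action names listed in the system prompt.
ACTIONS = ("ASSESS", "ADVANCE", "VERIFY", "SIMPLIFY", "SYNTHESIZE", "PIVOT", "OUTPUT")

# A line consisting solely of a known action tag, e.g. "[ASSESS]".
_HEADER = re.compile(r"^\[(%s)\][ \t]*$" % "|".join(ACTIONS), re.MULTILINE)
# The closing "Next action: [NAME]" line of a block.
_NEXT = re.compile(r"^Next action:[ \t]*\[(\w+)\][ \t]*$", re.MULTILINE)


def parse_actions(text):
    """Split a reply into (action, body, next_action) tuples (hypothetical helper)."""
    headers = list(_HEADER.finditer(text))
    blocks = []
    for i, header in enumerate(headers):
        # A block runs from its header to the next header, or to the end of the text.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[header.end():end]
        nxt = _NEXT.search(body)
        next_action = nxt.group(1) if nxt else None
        if nxt:
            body = body[:nxt.start()]
        blocks.append((header.group(1), body.strip(), next_action))
    return blocks


# Hand-written sample in the documented format (not real model output).
sample = """[ASSESS]

# The problem asks for 2 + 2.

Next action: [ADVANCE]

[ADVANCE]

# 2 + 2 = 4.

Next action: [OUTPUT]

[OUTPUT]

The answer is 4.
"""

for action, body, next_action in parse_actions(sample):
    print(action, "->", next_action)
```

In practice the `text` argument would be the `response.message.content` string returned by the chat call shown earlier on this page.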

References

Hugging Face