mannix/eurus-2-7b-prime:Q5_K_S/template

mannix/eurus-2-7b-prime:Q5_K_S

145 Downloads · Updated 1 year ago

Eurus-2-7B-PRIME is trained with the PRIME (Process Reinforcement through IMplicit rEward) method, an open-source solution for online reinforcement learning (RL) with process rewards that aims to advance the reasoning abilities of language models.

template

4b01fbe300da · 255B

{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
<|im_start|>{{ .Role }}
{{ .Content }}{{ if not $last }}<|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_end|>
<|im_start|>assistant
{{ end }}
{{- end }}
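
To make the control flow of the template above easier to follow, here is a minimal Python sketch of the prompt it appears to produce. It is a hand-written mirror of the template's logic, not Ollama's renderer: the function name `render_prompt` and the dict keys `role` and `content` are assumptions chosen for illustration, and the whitespace handling assumes the template's `{{-` and `-}}` trim markers behave as in Go's `text/template`.

```python
def render_prompt(messages):
    """Mirror the template: ChatML-style turns, leaving the final turn open."""
    parts = []
    for i, msg in enumerate(messages):
        # Equivalent of `$last := eq (len (slice $.Messages $i)) 1`
        last = i == len(messages) - 1

        # Every turn opens with the role on its own line, then the content.
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}")

        if not last:
            # Earlier turns are closed normally.
            parts.append("<|im_end|>\n")
        elif msg["role"] != "assistant":
            # Final non-assistant turn: close it and open the assistant turn
            # so the model generates the reply.
            parts.append("<|im_end|>\n<|im_start|>assistant\n")
        # A final assistant turn is left unclosed, so the model continues it.

    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hi"},
    ]
    print(render_prompt(demo))
```

With the two demo messages, the sketch prints a system turn and a user turn, each closed with `<|im_end|>`, followed by an open `<|im_start|>assistant` line. If the last message already has the role `assistant`, its content is emitted without `<|im_end|>`, which lets the model continue a partially written reply.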