A high-performing model trained with a new technique called Reflection-tuning, which teaches an LLM to detect mistakes in its reasoning and correct course.

70b

96.6K 2 months ago

6a340fffade7 · 297B
{{- range $i, $_ := .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|>
{{ .Content }}
{{- if ne (len (slice $.Messages $i)) 1 }}<|eot_id|>
{{- else if ne .Role "assistant" }}<|eot_id|>
{{- if eq .Role "user" }}
<|start_header_id|>assistant<|end_header_id|>
{{- end }}
{{- end }}
{{ end }}