d4abaa5f66b9 · 8.5GB · llama · 8.03B · Q8_0
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|> {{- if .System }} {{ .System }
{ "stop": [ "<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"


Llama-3.1-SuperNova-Lite

Overview

Llama-3.1-SuperNova-Lite is an 8B-parameter model developed by Arcee.ai, based on the Llama-3.1-8B-Instruct architecture. It is distilled from the larger Llama-3.1-405B-Instruct model, using offline logits extracted from that 405B-parameter model. This 8B variant of Llama-3.1-SuperNova maintains high performance while offering exceptional instruction-following capabilities and domain-specific adaptability.

The model was trained using a state-of-the-art distillation pipeline and an instruction dataset generated with EvolKit, ensuring accuracy and efficiency across a wide range of tasks. For more information on its training, visit blog.arcee.ai.
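As a rough illustration of what distilling from offline logits can look like, the sketch below computes a temperature-scaled KL divergence between a teacher's stored logits and a student's logits for a single token position. This is a generic textbook formulation, not Arcee.ai's actual pipeline: the loss form, the temperature value, the function names, and the toy logits are all assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) for one token position, scaled by T^2.

    The teacher logits are assumed to have been computed ahead of time
    ("offline") and loaded from storage, so the teacher model itself is
    not needed during student training.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over a 4-token vocabulary (illustrative values only).
teacher = [4.0, 1.0, 0.5, -2.0]
student_close = [3.8, 1.1, 0.4, -1.9]
student_far = [-2.0, 0.5, 1.0, 4.0]

loss_close = distillation_loss(student_close, teacher)
loss_far = distillation_loss(student_far, teacher)
print(f"close student: {loss_close:.4f}")
print(f"far student:   {loss_far:.4f}")
```

A student whose logits track the teacher's yields a loss near zero, while a student that ranks the tokens in the opposite order is penalized heavily. In a real training loop this term would be averaged over all token positions and typically combined with a standard cross-entropy loss.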

Llama-3.1-SuperNova-Lite excels in both benchmark performance and real-world applications, providing the power of large-scale models in a more compact, efficient form ideal for organizations seeking high performance with reduced resource requirements.

Evaluations

We will submit this model to the Open LLM Leaderboard for a more conclusive benchmark. In the meantime, here are our internal results, obtained with the main branch of lm-evaluation-harness:

Benchmark     SuperNova-Lite   Llama-3.1-8b-Instruct
IF_Eval       81.1             77.4
MMLU Pro      38.7             37.7
TruthfulQA    64.4             55.0
BBH           51.1             50.6
GPQA          31.2             29.02
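
To make the size of each gap easier to read, the short sketch below computes the per-benchmark difference between the two columns. The scores are copied from the table above; the variable names are illustrative only.

```python
# (SuperNova-Lite, Llama-3.1-8b-Instruct) scores, copied from the table above.
scores = {
    "IF_Eval": (81.1, 77.4),
    "MMLU Pro": (38.7, 37.7),
    "TruthfulQA": (64.4, 55.0),
    "BBH": (51.1, 50.6),
    "GPQA": (31.2, 29.02),
}

# Difference in points, rounded to avoid floating-point noise.
deltas = {name: round(lite - base, 2) for name, (lite, base) in scores.items()}

for name, delta in deltas.items():
    print(f"{name:<11} {delta:+.2f}")

largest = max(deltas, key=deltas.get)
print(f"Largest gain: {largest}")
```

SuperNova-Lite scores higher on all five benchmarks, with the widest margin on TruthfulQA and the narrowest on BBH.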

https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite