
Marsa Maroc Fine-Tuned Meta LLaMA 3.1 Model (8B Parameters)

This repository contains a fine-tuned version of Meta's LLaMA 3.1 model with 8.03 billion parameters. The model is tuned for natural language understanding and generation tasks and is quantized (Q4_K_M) to reduce its on-disk size to 4.9 GB.

Meta LLaMA 3.1 Family

The LLaMA 3.1 family of models includes the following variants:

  • 8B
  • 70B
  • 405B

The LLaMA 3.1 405B model is the first openly available model that rivals leading AI systems in areas such as general knowledge, steerability, mathematics, tool usage, and multilingual translation. Across the family, Meta reports state-of-the-art performance on a broad range of benchmark tasks.

  • Multilingual & Long-Context Support: The updated 8B and 70B models have a significantly longer context length (128K tokens) and improved tool usage. This allows these models to handle advanced use cases like long-form text summarization, multilingual conversational agents, and coding assistants.
  • License Flexibility: Meta has updated the LLaMA license to allow developers to use the outputs of the LLaMA models, including the 405B model, to improve other models.

Model Evaluations

For the LLaMA 3.1 release, Meta has conducted extensive evaluations on over 150 benchmark datasets across multiple languages. In addition, Meta performed human evaluations comparing LLaMA 3.1 with leading models in real-world scenarios.

Meta’s evaluation results suggest that their flagship 405B model competes effectively with the top AI models, such as GPT-4, GPT-4o, and Claude 3.5 Sonnet. Moreover, even Meta’s smaller models, like the 8B and 70B variants, show competitive performance with both closed and open models of similar size.

Fine-Tuned Model Details

  • Architecture: LLaMA 3.1
  • Parameters: 8.03 Billion
  • Quantization: Q4_K_M
  • Model Size: 4.9 GB
  • Token Limit: 200 tokens per prompt
  • Fine-tuned by: Marsa Maroc
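
As a sanity check on the figures above, the 4.9 GB file size is consistent with roughly 4.9 bits per weight, which matches Q4_K_M's mixed 4-bit quantization scheme. A minimal sketch (treating GB as 10^9 bytes):

```python
# Back-of-the-envelope check: does a 4.9 GB file match
# 8.03 B parameters quantized with Q4_K_M (~4-5 bits/weight)?
params = 8.03e9          # parameter count
size_bytes = 4.9e9       # quantized file size in bytes
bits_per_weight = size_bytes * 8 / params
print(f"{bits_per_weight:.2f} bits per weight")  # prints "4.88 bits per weight"
```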

Model Template

The model uses the following prompt template (Go template syntax) for instruction-response generation:

```
Below are some instructions that describe tasks. Write responses that appropriately complete each request.

{{ if .Prompt }}
Instruction:

{{ .Prompt }}
{{ end }}

Response:

{{ .Response }}
```
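
For illustration, the conditional logic of the template above can be mimicked in plain Python. The `render` helper below is a hypothetical sketch, not part of the model runtime (the serving layer handles templating itself):

```python
# Illustrative re-implementation of the Go-template logic above.
def render(prompt, response=""):
    parts = ["Below are some instructions that describe tasks. "
             "Write responses that appropriately complete each request.\n"]
    if prompt:  # mirrors {{ if .Prompt }} ... {{ end }}
        parts.append(f"Instruction:\n\n{prompt}\n")
    parts.append(f"Response:\n\n{response}")
    return "\n".join(parts)

# When no prompt is supplied, the Instruction block is omitted entirely.
print(render("Summarize this text."))
```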