Megrez-3B-Instruct is a large language model trained by Infinigence AI. Megrez-3B aims to provide a fast inference, compact, and powerful edge-side intelligent solution through software-hardware co-design.

Megrez-3B: The integration of software and hardware unleashes the potential of edge intelligence

Introduction

High Accuracy: Megrez-3B successfully compresses the capabilities of the previous 14 billion model into a 3 billion size, and achieves excellent performance on mainstream benchmarks.
High Speed: A smaller model does not necessarily bring faster speed. Megrez-3B ensures a high degree of compatibility with mainstream hardware through software-hardware co-design, leading an inference speedup up to 300% compared to previous models of the same accuracy.
Easy to Use: In the beginning, we had a debate about model design: should we design a unique but efficient model structure, or use a classic structure for ease of use? We chose the latter and adopt the most primitive LLaMA structure, which allows developers to deploy the model on various platforms without any modifications and minimize the complexity of future development.
Rich Applications: We have provided a fullstack WebSearch solution. Our model is functionally trained on web search tasks, enabling it to automatically determine the timing of search invocations and provide better summarization results. The complete deployment code is released on github.

Model Card

Model name: Megrez-3B-Instruct
Architecture: Llama-2 with GQA
Context length: 32K tokens
Params (Total): 2.92B
Params (Backbone only, w/o Emb or Softmax): 2.29B
Vocab Size: 122880
Training data: 3T tokens
Supported languages: Chinese & English

Megrez-3B-Instruct is a large language model trained by Infinigence AI. Megrez-3B aims to provide a fast inference, compact, and powerful edge-side intelligent solution through software-hardware co-design.

Models

Readme

Megrez-3B: The integration of software and hardware unleashes the potential of edge intelligence

Introduction

Model Card

hugging face:

https://huggingface.co/Infinigence/Megrez-3B-Instruct/blob/main/README_EN.md