minicpm3_4b-ggml-model-Q4_K_M.gguf

openbmb/MiniCPM3-4B

MiniCPM Repo | MiniCPM Paper

Introduction

MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125 and is comparable with many recent 7B~9B models.

Compared to MiniCPM1.0/MiniCPM2.0, MiniCPM3-4B has a more powerful and versatile skill set that enables more general usage. MiniCPM3-4B supports function calling and a code interpreter. Please refer to Advanced Features for usage guidelines.

MiniCPM3-4B has a 32k context window. Equipped with LLMxMapReduce, MiniCPM3-4B can in theory handle infinite context without requiring a huge amount of memory.
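
The README does not describe how LLMxMapReduce works internally. As a rough illustration of the general map-reduce idea only (not the LLMxMapReduce implementation), the sketch below splits a long input into chunks that each fit a fixed window, processes every chunk independently, then makes one final call over the merged partial results. The names `chunk_lines`, `map_reduce_query`, and `toy_model_call` are hypothetical, and `toy_model_call` is a keyword filter standing in for a real model call.

```python
from typing import Callable, List

def chunk_lines(lines: List[str], max_lines: int) -> List[List[str]]:
    """Split the input into consecutive chunks of at most max_lines lines."""
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]

def map_reduce_query(
    lines: List[str],
    query: str,
    model_call: Callable[[List[str], str], List[str]],
    max_lines: int,
) -> List[str]:
    # Map: each chunk is processed on its own, so no single call
    # ever sees more than max_lines lines of the original input.
    partials = [model_call(chunk, query) for chunk in chunk_lines(lines, max_lines)]
    # Reduce: merge the partial outputs and make one final call over them.
    # Assumption: the merged partials are short enough to fit one window.
    merged = [line for partial in partials for line in partial]
    return model_call(merged, query)

def toy_model_call(context: List[str], query: str) -> List[str]:
    """Hypothetical stand-in for a model call: keep lines mentioning the query."""
    return [line for line in context if query in line]

# A 100-line "document" with one relevant line, processed 10 lines at a time.
document = [f"line {i}: filler" for i in range(100)]
document[42] = "line 42: the secret code is 1234"
result = map_reduce_query(document, "secret", toy_model_call, max_lines=10)
print(result)
```

Memory per call is bounded by the chunk size rather than the document length, which is the property the paragraph above alludes to. A real system would also need to handle information that spans chunk boundaries, which this sketch ignores.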

Evaluation Results

| Benchmark | Qwen2-7B-Instruct | GLM-4-9B-Chat | Gemma2-9B-it | Llama3.1-8B-Instruct | GPT-3.5-Turbo-0125 | Phi-3.5-mini-Instruct (3.8B) | MiniCPM3-4B |
|---|---|---|---|---|---|---|---|
| English | | | | | | | |
| MMLU | 70.5 | 72.4 | 72.6 | 69.4 | 69.2 | 68.4 | 67.2 |
| BBH | 64.9 | 76.3 | 65.2 | 67.8 | 70.3 | 68.6 | 70.2 |
| MT-Bench | 8.41 | 8.35 | 7.88 | 8.28 | 8.17 | 8.60 | 8.41 |
| IFEVAL (Prompt Strict-Acc.) | 51.0 | 64.5 | 71.9 | 71.5 | 58.8 | 49.4 | 68.4 |
| Chinese | | | | | | | |
| CMMLU | 80.9 | 71.5 | 59.5 | 55.8 | 54.5 | 46.9 | 73.3 |
| CEVAL | 77.2 | 75.6 | 56.7 | 55.2 | 52.8 | 46.1 | 73.6 |
| AlignBench v1.1 | 7.10 | 6.61 | 7.10 | 5.68 | 5.82 | 5.73 | 6.74 |
| FollowBench-zh (SSR) | 63.0 | 56.4 | 57.0 | 50.6 | 64.6 | 58.1 | 66.8 |
| Math | | | | | | | |
| MATH | 49.6 | 50.6 | 46.0 | 51.9 | 41.8 | 46.4 | 46.6 |
| GSM8K | 82.3 | 79.6 | 79.7 | 84.5 | 76.4 | 82.7 | 81.1 |
| MathBench | 63.4 | 59.4 | 45.8 | 54.3 | 48.9 | 54.9 | 65.6 |
| Code | | | | | | | |
| HumanEval+ | 70.1 | 67.1 | 61.6 | 62.8 | 66.5 | 68.9 | 68.3 |
| MBPP+ | 57.1 | 62.2 | 64.3 | 55.3 | 71.4 | 55.8 | 63.2 |
| LiveCodeBench v3 | 22.2 | 20.2 | 19.2 | 20.4 | 24.0 | 19.6 | 22.6 |
| Function Call | | | | | | | |
| BFCL v2 | 71.6 | 70.1 | 19.2 | 73.3 | 75.4 | 48.4 | 76.0 |
| Overall | | | | | | | |
| Average | 65.3 | 65.0 | 57.9 | 60.8 | 61.0 | 57.2 | 66.3 |
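
The README does not state how the Average row is computed. The figures are consistent with a plain mean over the 15 benchmark rows, provided the two 10-point scores (MT-Bench and AlignBench v1.1) are first multiplied by 10. That scaling is an assumption inferred from the numbers, not something the source states. A short check for two of the columns:

```python
# Scores copied from the table above, in row order (MMLU ... BFCL v2).
SCORES = {
    "Qwen2-7B-Instruct": [70.5, 64.9, 8.41, 51.0, 80.9, 77.2, 7.10, 63.0,
                          49.6, 82.3, 63.4, 70.1, 57.1, 22.2, 71.6],
    "MiniCPM3-4B": [67.2, 70.2, 8.41, 68.4, 73.3, 73.6, 6.74, 66.8,
                    46.6, 81.1, 65.6, 68.3, 63.2, 22.6, 76.0],
}

# Assumption: MT-Bench (index 2) and AlignBench v1.1 (index 6) are on a
# 10-point scale and are rescaled to 100 before averaging.
TEN_POINT_ROWS = {2, 6}

def table_average(scores):
    """Mean over all rows after rescaling the 10-point benchmarks."""
    scaled = [s * 10 if i in TEN_POINT_ROWS else s for i, s in enumerate(scores)]
    return round(sum(scaled) / len(scaled), 1)

averages = {model: table_average(s) for model, s in SCORES.items()}
print(averages)  # {'Qwen2-7B-Instruct': 65.3, 'MiniCPM3-4B': 66.3}
```

Both results match the Average row, which suggests each benchmark carries equal weight regardless of category.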

Statement

  • As a language model, MiniCPM3-4B generates content by learning from a vast amount of text.
  • However, it does not possess the ability to comprehend or express personal opinions or value judgments.
  • Any content generated by MiniCPM3-4B does not represent the viewpoints or positions of the model developers.
  • Therefore, when using content generated by MiniCPM3-4B, users should take full responsibility for evaluating and verifying it on their own.

LICENSE

  • This repository is released under the Apache-2.0 License.
  • The usage of MiniCPM3-4B model weights must strictly follow MiniCPM Model License.md.
  • The models and weights of MiniCPM3-4B are completely free for academic research. After filling out a “questionnaire” for registration, they are also available for free commercial use.

Citation

@article{hu2024minicpm,
  title={MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies},
  author={Hu, Shengding and Tu, Yuge and Han, Xu and He, Chaoqun and Cui, Ganqu and Long, Xiang and Zheng, Zhi and Fang, Yewei and Huang, Yuxiang and Zhao, Weilin and others},
  journal={arXiv preprint arXiv:2404.06395},
  year={2024}
}