
This model is based on unsloth/GLM-4.7-Flash and was trained on a small reasoning dataset generated by Claude Opus 4.5, with reasoning effort set to High.

Capabilities: tools, thinking
ollama run SimonPu/GLM-4.7-Flash:Q8

Details

212fbbe9b83f · 32GB

  • Architecture: deepseek2
  • Parameters: 29.9B
  • Quantization: Q8_0

Template:
[gMASK]<sop>{{ if .System }}<|system|> {{ .System }}{{ end }}{{ if .Prompt }}<|user|> {{ .Prompt }}{

Parameters:
{ "min_p": 0.01, "repeat_penalty": 1, "seed": 3407, "stop": [ "<|user|>"
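
These parameters correspond to an Ollama Modelfile. A minimal sketch of what such a Modelfile could look like, assuming Ollama's standard `FROM`/`PARAMETER` directives (the base model tag is taken from the `ollama run` command above):

```
FROM SimonPu/GLM-4.7-Flash:Q8
PARAMETER min_p 0.01
PARAMETER repeat_penalty 1
PARAMETER seed 3407
PARAMETER stop "<|user|>"
```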

Readme

GLM-4.7-Flash

👋 Join our Discord community.
📖 Check out the GLM-4.7 technical blog and the technical report (GLM-4.5).
📝 Use GLM-4.7-Flash API services on the Z.ai API Platform.
👉 One click to GLM-4.7.

Introduction

GLM-4.7-Flash is a 30B-A3B MoE model. As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.

Default Settings (Most Tasks), from the Run GLM-4.7-Flash guide:

  • temperature: 1.0
  • top-p: 0.95
  • min-p: 0.01
  • repeat-penalty: 1.0
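
The defaults above can be passed per request through Ollama's REST API as `options` on `/api/generate`. A minimal sketch, assuming a local Ollama server on the default port 11434 and the model tag from the `ollama run` command above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_payload(prompt: str) -> dict:
    """Build a /api/generate request body using the recommended defaults."""
    return {
        "model": "SimonPu/GLM-4.7-Flash:Q8",
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": 1.0,
            "top_p": 0.95,
            "min_p": 0.01,
            "repeat_penalty": 1.0,
        },
    }

def generate(prompt: str) -> str:
    """POST the payload to the local Ollama server and return the response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_generate_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Per-request `options` override whatever is baked into the model's Modelfile, which is convenient for experimenting with sampling settings without re-creating the model.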

Benchmarks

Citation

If you find our work useful in your research, please consider citing the following paper:


@misc{5team2025glm45agenticreasoningcoding,
      title={GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models}, 
      author={GLM Team and Aohan Zeng and Xin Lv and others},
      year={2025},
      eprint={2508.06471},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.06471}, 
}