174 Downloads · Updated 1 week ago
ollama run MrScratchcat22/GLM-4.7-Flash-REAP-23B-A3B
fac1e5dddd39 · 14GB
Introducing GLM-4.7-Flash-REAP-23B-A3B, a memory-efficient compressed variant of GLM-4.7-Flash that maintains near-identical performance while being 25% lighter.
This model was created using REAP (Router-weighted Expert Activation Pruning), a novel expert pruning method that selectively removes redundant experts while preserving the router’s independent control over remaining experts. Key features include: