846 Downloads Updated 4 months ago
| Name | Size | Context | Input | Updated |
|------|------|---------|-------|---------|
| GLM-4-0414-32b-128k-Q4_K_M:latest | 20GB | 32K | Text | 4 months ago |
Quantized with YaRN RoPE scaling to a 128k context (scaling factor 4). Requires Ollama >= 0.6.6 to run. `num_ctx` in the Modelfile defaults to 64k only because I don't have gobs of VRAM.
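If you do have the VRAM for the full 128k window, you can override the 64k default with your own Modelfile. A minimal sketch, assuming the model tag above and that your hardware can hold the larger KV cache:

```
FROM GLM-4-0414-32b-128k-Q4_K_M:latest
# Raise the context window from the 64k default to the full 128k (131072 tokens).
# KV-cache memory grows roughly linearly with num_ctx, so make sure you have the VRAM.
PARAMETER num_ctx 131072
```

Build and run it with `ollama create glm4-128k -f Modelfile` followed by `ollama run glm4-128k` (the `glm4-128k` name is just an example).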