209 1 year ago

GLM-Z1-0414-32b thinking model, with YaRN RoPE scaling extending the context window to 128k tokens

tools