MichelRosselli/GLM-4.6-REAP-268B-A32B:Q8

MichelRosselli/GLM-4.6-REAP-268B-A32B:Q8_0

133 Downloads Updated 7 months ago

GLM-4.6-REAP-268B-A32B (by Cerebras) is a memory-efficient, compressed variant of GLM-4.6 that maintains near-identical performance while being 25% lighter.

Capabilities: tools, thinking

ollama run MichelRosselli/GLM-4.6-REAP-268B-A32B:Q8_0

curl http://localhost:11434/api/chat \
  -d '{
    "model": "MichelRosselli/GLM-4.6-REAP-268B-A32B:Q8_0",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='MichelRosselli/GLM-4.6-REAP-268B-A32B:Q8_0',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)
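
Since the model is tagged with tools and thinking, a request can opt into both. The following is a minimal sketch using only the Python standard library, assuming a local Ollama server whose /api/chat endpoint accepts the `tools` and `think` fields. The `get_weather` tool and the helper names `build_chat_payload`, `send_chat`, and `demo` are hypothetical and exist only for illustration.

```python
import json
import urllib.request

MODEL = "MichelRosselli/GLM-4.6-REAP-268B-A32B:Q8_0"

# Hypothetical tool definition, for illustration only.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_chat_payload(prompt, tools=None, think=False):
    """Build the JSON body for a non-streaming /api/chat request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    if tools:
        payload["tools"] = tools  # advertise callable functions to the model
    if think:
        payload["think"] = True  # ask for the reasoning trace separately
    return payload


def send_chat(payload, host="http://localhost:11434"):
    """POST the payload to a running Ollama server and return the parsed reply."""
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def demo():
    """Send one tool-enabled request; requires the server and the model."""
    payload = build_chat_payload(
        "What is the weather in Paris?", tools=[GET_WEATHER_TOOL], think=True
    )
    message = send_chat(payload)["message"]
    print(message.get("thinking"))
    for call in message.get("tool_calls", []):
        print(call["function"]["name"], call["function"]["arguments"])
```

Calling `demo()` only works with the server running and the model pulled. The model may answer directly instead of requesting a tool call, so the `tool_calls` list can be absent.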

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'MichelRosselli/GLM-4.6-REAP-268B-A32B:Q8_0',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details


4232aa9ff0de · 286GB

model

arch glm4moe · parameters 269B · quantization Q8_0

286GB
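
The listed size is consistent with the parameter count and quantization. As a rough check, the sketch below assumes every weight is stored in GGML's Q8_0 layout (blocks of 32 int8 values plus one 16-bit scale, or 8.5 bits per weight). Real files usually keep a few tensors at other precisions, so this is an estimate, not an exact accounting.

```python
# Q8_0 stores weights in blocks of 32: 32 one-byte values plus a 2-byte scale.
BLOCK_WEIGHTS = 32
BLOCK_BYTES = 32 * 1 + 2  # 34 bytes per block, i.e. 8.5 bits per weight


def q8_0_size_gb(n_params):
    """Estimate file size in decimal gigabytes if all weights were Q8_0."""
    total_bytes = n_params / BLOCK_WEIGHTS * BLOCK_BYTES
    return total_bytes / 1e9


estimate = q8_0_size_gb(269e9)
print(f"{estimate:.1f} GB")  # prints 285.8 GB
```

The result rounds to the 286GB shown for this tag.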

template

[gMASK]<sop> {{- if .Tools }}<|system|> # Tools You may call one or more functions to assist with th

1.8kB

params

{ "stop": [ "<|system|>", "<|user|>", "<|assistant|>" ] }

81B
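
The stop list tells the runtime to end generation as soon as the model starts to emit one of the chat role markers. The sketch below only illustrates that behavior on a plain string; it is not Ollama's implementation, and `truncate_at_stop` is a hypothetical helper.

```python
STOP_SEQUENCES = ["<|system|>", "<|user|>", "<|assistant|>"]


def truncate_at_stop(text, stops=STOP_SEQUENCES):
    """Return text cut at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest match across all stops
    return text[:cut]


print(truncate_at_stop("Hello! How can I help?<|user|>ignored"))
# prints Hello! How can I help?
```

Without these stops, a model that runs past the end of its turn could go on to write the next role marker and invent further turns.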

Readme

No readme