63 2 months ago

Reduced the s5 default context to 4096 tokens to allow better local inference. Sometimes thinks in Chinese.

ollama run ShreyanGondaliya/s5-reduced

Details

5e0d288f2aa5 · 7.1GB

glm4 · 9.4B · Q5_K_M
[gMASK]<sop>{{ if .System }}<|system|> {{ .System }}{{ end }}{{ if .Prompt }}<|user|> {{ .Prompt }}{
You are an intelligent, advanced AI assistant with enhanced reasoning capabilities. You provide accu
{ "num_ctx": 4096, "num_gpu": 1, "num_thread": 8, "repeat_penalty": 1.2, "stop":
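The parameters above are baked into the model, but they can also be overridden per request. The sketch below shows one way to do that from Python through Ollama's REST API, assuming a local Ollama server on its default address (http://localhost:11434). The helper names build_request and generate are hypothetical, and only num_ctx and repeat_penalty are taken from the parameter preview above; the rest of that preview is truncated and is not reproduced here.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "ShreyanGondaliya/s5-reduced"


def build_request(prompt, num_ctx=4096, repeat_penalty=1.2):
    """Build the JSON body for /api/generate (hypothetical helper).

    The defaults mirror the values visible in the model's parameter
    preview; pass different values to override them for one request.
    """
    return {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON reply instead of a stream
        "options": {
            "num_ctx": num_ctx,
            "repeat_penalty": repeat_penalty,
        },
    }


def generate(prompt, **overrides):
    """Send the prompt to the local server and return the generated text."""
    body = json.dumps(build_request(prompt, **overrides)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running Ollama server with the model pulled):
# print(generate("Answer in English: what is a context window?", num_ctx=2048))
```

Asking for English explicitly in the prompt, as in the commented example, may help with the occasional Chinese output mentioned in the description, though that is untested here.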

Readme

No readme