
A model based on GLM-4.6v-flash:9b (q5_k_m quantization), uncensored. For local use I recommend editing the context length (num_ctx) in the Modelfile, as it is set to 128k. EDIT: a new locally optimised version of the same model with a 4096 context is available at https://ollama.com/ShreyanGondaliya/s5-reduced
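To shrink the context yourself, you can dump the Modelfile, change the `num_ctx` parameter, and rebuild under a new name. A minimal sketch (the model tag `ShreyanGondaliya/s5` is an assumption; substitute the tag you actually pulled):

```
# Export the current Modelfile (hypothetical tag, replace with yours)
ollama show ShreyanGondaliya/s5 --modelfile > Modelfile

# In Modelfile, change the context line to e.g.:
#   PARAMETER num_ctx 4096

# Rebuild as a new local model
ollama create s5-4k -f Modelfile
```

A lower `num_ctx` reduces the KV-cache memory footprint, which is what makes the model easier to run on modest local hardware.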

7daaff2959ce · 204B
{
  "num_ctx": 128000,
  "num_gpu": 1,
  "num_thread": 8,
  "repeat_penalty": 1.2,
  "stop": [
    "<|user|>",
    "<|assistant|>",
    "<|system|>",
    "<|observation|>"
  ],
  "temperature": 0.5,
  "top_p": 0.8
}
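These defaults can also be overridden per request instead of editing the Modelfile, via the `options` field of Ollama's REST API. A sketch against a locally running server (model tag and prompt are illustrative):

```
curl http://localhost:11434/api/generate -d '{
  "model": "ShreyanGondaliya/s5-reduced",
  "prompt": "Hello",
  "options": {
    "num_ctx": 4096,
    "temperature": 0.5,
    "top_p": 0.8
  }
}'
```

Request-level `options` take precedence over the values baked into the model's parameters file, so this is a quick way to test a smaller context without rebuilding.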