31 pulls · Updated 13 hours ago

Quantized Mistral 7B Instruct models optimized for fast, CPU-only local inference with Ollama. Multiple quantization variants balance speed, quality, and memory efficiency.

6b7a7843ddd2 · 216B
{
  "num_batch": 512,
  "num_ctx": 768,
  "num_gpu": 0,
  "num_predict": 200,
  "num_thread": 8,
  "repeat_penalty": 1.15,
  "stop": [
    "<|im_start|>",
    "<|im_end|>",
    "</s>"
  ],
  "temperature": 0.75,
  "top_k": 40,
  "top_p": 0.95
}
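
These are standard Ollama runtime options: "num_gpu": 0 keeps every layer on the CPU, "num_thread": 8 sets the worker thread count, the small "num_ctx" of 768 trades context length for memory, and the "stop" strings cover both ChatML-style markers and Mistral's native end-of-sequence token. Below is a minimal sketch of passing the same options at request time with the ollama Python client; the model tag is a placeholder for whichever variant you pulled, and the prompt is only illustrative.

import ollama

# Mirror the params blob above as per-request options.
options = {
    "num_gpu": 0,          # offload no layers to a GPU; run fully on CPU
    "num_thread": 8,       # CPU threads used for inference
    "num_ctx": 768,        # context window, in tokens
    "num_batch": 512,      # prompt-processing batch size
    "num_predict": 200,    # cap on generated tokens
    "temperature": 0.75,
    "top_k": 40,
    "top_p": 0.95,
    "repeat_penalty": 1.15,
    "stop": ["<|im_start|>", "<|im_end|>", "</s>"],
}

response = ollama.chat(
    model="mistral-7b-instruct-cpu",  # placeholder: substitute a published variant tag
    messages=[{"role": "user", "content": "Explain quantization in two sentences."}],
    options=options,
)
print(response["message"]["content"])

Per-request options override the values baked into the model, so this is also a convenient way to try a different num_ctx or temperature without rebuilding the model.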