I’ve uploaded the quants of these models that I find most useful. I quantized them myself from the fp16 GGUF files provided by mradermacher. The default temperature has been adjusted, as I find the smaller Qwen models tend to hallucinate too much at higher temperatures.
For 1.5b:
- Temperature set to 0.2
- Q4_K_M is, to me, the lower limit of usability
- Q5_K_M is the default, and a fine middle ground with perfectly acceptable quality to me
- Q8_0 has no compromises on accuracy or performance; I don’t see a reason to use a larger un-quantized fp16 model
For 3b:
- Temperature set to 0.5
- Q4_0 is as low as I’d go due to quality concerns
- Q4_K_M is the default, and a fine middle ground with perfectly acceptable quality to me
- Q5_K_M is just about perfect in my experience with this model
- Q8_0 has no compromises on accuracy or performance; I don’t see a reason to use a larger un-quantized fp16 model
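If you’d rather run a different temperature than the baked-in default, you can layer your own on top with an Ollama Modelfile. This is a minimal sketch; the model tag below is a placeholder, so substitute the actual tag of the quant you pulled:

```
# Placeholder tag — replace with the actual quant tag you pulled.
FROM this-model:1.5b-q5_K_M

# Override the baked-in default (0.2 for the 1.5b quants above).
PARAMETER temperature 0.1
```

Build and run it with `ollama create my-variant -f Modelfile` followed by `ollama run my-variant`.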
The ultimate general-purpose local AI model, enabling coding, math, and more.
Created by the cognitivecomputations team.