latest · 4.1GB · Q4_0 · 7B · 2 Pulls · Updated 3 months ago
bd2c46ae824d · 4.1GB
model · arch llama · parameters 7.24B · quantization Q4_0 · 4.1GB
params · {"temperature":0.1} · 20B
Readme
This is not the official prometheus2.0 Ollama model!
I am still testing things and need to make sure everything works!
The main reason I made this is that the fp16 version won't fit on a T4 GPU, so I needed to quantize the model, which I got from this HuggingFace page.
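A back-of-envelope check shows why the fp16 weights don't fit and Q4_0 does. This is a rough sketch: it assumes the T4's 16 GB of VRAM, 2 bytes per weight for fp16, and roughly 4.5 bits per weight for Q4_0 (including quantization scales); actual usage also includes KV cache and activations.

```python
# Rough VRAM estimate for the 7.24B-parameter model listed above.
# Assumptions: fp16 = 2 bytes/weight, Q4_0 ≈ 4.5 bits/weight, T4 = 16 GB.
PARAMS = 7.24e9

fp16_gb = PARAMS * 2 / 1e9        # weights alone in fp16
q4_0_gb = PARAMS * 4.5 / 8 / 1e9  # weights in Q4_0, matches the ~4.1GB file

T4_VRAM_GB = 16
print(f"fp16: {fp16_gb:.1f} GB, Q4_0: {q4_0_gb:.1f} GB of {T4_VRAM_GB} GB")
# fp16 weights nearly fill the card before any KV cache or activations,
# so quantizing to Q4_0 is what leaves headroom to actually run inference.
```

The ~4.1 GB Q4_0 estimate lines up with the file size shown on the model page.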