Q4.0

7B


This is not the official Prometheus 2.0 model for Ollama!

I am still testing this and need to make sure it works!

The main reason I made this is that the fp16 version won’t fit on a T4 GPU, so I needed to quantize the model, which I got from this HuggingFace page.
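To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: the parameter count is taken as exactly 7 billion (from the "7B" label, the real count may differ), Q4_0 is taken at 4.5 bits per weight (32 weights stored in an 18-byte block), and the T4 is assumed to expose roughly 15 GiB of usable memory. It counts weights only and ignores the KV cache, activations, and runtime overhead.

```python
# Rough estimate of weight memory for a 7B model at two precisions.
# All constants below are assumptions for illustration, not measurements.

GIB = 2**30


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Return the size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / GIB


N_PARAMS = 7e9          # assumed from the "7B" label
T4_USABLE_GIB = 15.0    # assumed usable memory on a T4

fp16 = weight_gib(N_PARAMS, 16)    # 2 bytes per weight
q4_0 = weight_gib(N_PARAMS, 4.5)   # 18 bytes per block of 32 weights

print(f"fp16 weights: {fp16:.1f} GiB, headroom: {T4_USABLE_GIB - fp16:.1f} GiB")
print(f"Q4_0 weights: {q4_0:.1f} GiB, headroom: {T4_USABLE_GIB - q4_0:.1f} GiB")
```

Under these assumptions the fp16 weights alone come to about 13.0 GiB, which leaves very little room for anything else on the card, while the Q4_0 weights come to about 3.7 GiB.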