54 Downloads Updated 1 year ago
ollama run did100/phi4_Q8
d04922f21d98 · 16GB
This is Phi-4 from Microsoft with Q8 quantization. The official Ollama version uses Q4_K_M.
This model was built doing the following:
With a ctx-size of 3200, the official model takes around 10GB of VRAM, while this Q8 version takes 16GB.
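As a rough sanity check on those figures, the weight footprint of each quantization can be estimated from the parameter count and the average bits per weight. This is only a sketch: the parameter count (about 14.7 billion), the assumption that "Q8" means the GGUF `Q8_0` type, and the bits-per-weight values are all assumptions and were not measured on this model.

```python
# Rough estimate of the memory needed just for the quantized weights.
# All constants below are assumptions, not measurements from this model page.

PARAMS = 14.7e9  # assumed Phi-4 parameter count (~14.7 billion)

# Assumed average bits per weight for each GGUF quantization type.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,  # approximate average for this mixed-precision K-quant
    "Q8_0": 8.5,     # 8-bit values plus one 16-bit scale per 32-weight block
}


def weight_size_gb(params: float, bits_per_weight: float) -> float:
    """Return the weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# Estimated weight size per quantization type, in GB.
estimates = {
    name: weight_size_gb(PARAMS, bpw) for name, bpw in BITS_PER_WEIGHT.items()
}

for name, size in estimates.items():
    print(f"{name}: ~{size:.1f} GB of weights")
```

Under these assumptions the Q8 estimate lands close to the 16GB quoted above, and the Q4_K_M estimate falls somewhat below the 10GB VRAM figure. The difference is plausibly the KV cache for the 3200-token context plus runtime buffers, which this sketch does not model.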