141 Downloads Updated 1 year ago
ollama run neotherack/phi4_tools:14b_q2_K
b062969d651e · 5.5GB
This is my very first quantization and template for Phi4 on Ollama. It seems to handle function calling while keeping VRAM usage low.
I recommend running it with these environment variables set: OLLAMA_FLASH_ATTENTION=true OLLAMA_KV_CACHE_TYPE=q8_0
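As a rough sketch of how those two variables might be applied, assuming a Unix-like shell and an Ollama server that you start by hand from that same shell (if Ollama runs as a background service on your machine, the variables would need to go into the service's environment instead):

```shell
# Assumption: these are exported in the shell that will launch the Ollama server,
# since the server process (not the `ollama run` client) reads them.
export OLLAMA_FLASH_ATTENTION=true   # enable flash attention
export OLLAMA_KV_CACHE_TYPE=q8_0     # quantize the K/V cache to 8-bit to save VRAM

# Next steps, run manually:
#   ollama serve                                  # start the server from this shell
#   ollama run neotherack/phi4_tools:14b_q2_K     # then, in another terminal
```

The quantized K/V cache setting depends on flash attention being enabled, which is why the two are set together.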
It works fairly well on my old laptop with 4 GB of VRAM :)