767 Downloads Updated 7 months ago
ollama run zendar79/qwen3:4b-q4_0
ad253fce0c56 · 2.4GB
pip install -U huggingface_hub
huggingface-cli download Qwen/Qwen3-4B-Instruct-2507 \
--local-dir ./Qwen3-4B-Instruct-2507 \
--exclude "*.git*" "README.md" ".gitattributes"
python convert_hf_to_gguf.py ./Qwen3-4B-Instruct-2507 \
--outfile ./qwen3-4b-f16.gguf \
--outtype f16
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
cmake -B build
cmake --build build --config Release
Alternatively, skip the build step and download the llama.cpp release that matches your platform to get the tools.
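With a CMake build like the one above, the compiled tools usually end up under build/bin rather than the repository root, while an unpacked release typically has them in its top-level directory. A minimal sketch for locating llama-quantize before running it; find_quantize and its candidate paths are assumptions for illustration, not part of llama.cpp:

```shell
# Hypothetical helper: print the first llama-quantize binary found
# among a few common locations (assumed, adjust for your setup).
find_quantize() {
  for candidate in ./build/bin/llama-quantize \
                   ./build/bin/Release/llama-quantize.exe \
                   ./llama-quantize; do
    if [ -x "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

if QUANTIZE=$(find_quantize); then
  echo "Using $QUANTIZE"
else
  echo "llama-quantize not found; build llama.cpp or unpack a release first"
fi
```

Run it from the llama.cpp directory (or the unpacked release directory) and use the printed path in place of ./llama-quantize if they differ.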
./llama-quantize ./qwen3-4b-f16.gguf ./qwen3-4b-q4_k_m.gguf q4_k_m
./llama-quantize ./qwen3-4b-f16.gguf ./qwen3-4b-q4_0.gguf q4_0
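As a quick sanity check after conversion and quantization, you can confirm that each output file starts with the 4-byte GGUF magic. A minimal sketch, assuming the file names used in the commands above; is_gguf is a hypothetical helper, not a llama.cpp tool:

```shell
# Hypothetical helper: succeeds if the file exists and begins with
# the ASCII magic "GGUF".
is_gguf() {
  [ -f "$1" ] && [ "$(head -c 4 "$1")" = "GGUF" ]
}

# File names assumed from the convert and quantize commands.
for f in ./qwen3-4b-f16.gguf ./qwen3-4b-q4_k_m.gguf ./qwen3-4b-q4_0.gguf; do
  if is_gguf "$f"; then
    echo "$f: looks like a GGUF file ($(wc -c < "$f") bytes)"
  else
    echo "$f: missing or not a GGUF file"
  fi
done
```

This only checks the header, so it catches truncated or misnamed outputs but says nothing about model quality.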
See this page for the template format and how to import the model into Ollama.
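The import itself can be sketched as follows. This assumes the q4_0 file name from the quantize step and a hypothetical local model name (qwen3-4b-q4_0-local); the chat template is deliberately left out here, since its format is described on the linked page rather than in this text:

```shell
# Write a minimal Modelfile pointing at the quantized GGUF
# (path assumed from the quantize step; add the template separately).
cat > Modelfile <<'EOF'
FROM ./qwen3-4b-q4_0.gguf
EOF

# Register the model under a hypothetical local name, if ollama is installed.
if command -v ollama >/dev/null 2>&1; then
  ollama create qwen3-4b-q4_0-local -f Modelfile
else
  echo "ollama not found; Modelfile written but model not created"
fi
```

Once created, the model can be started with ollama run followed by the name you chose.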