Qwen 3.5 9B, quantized by BatiAI. Runs at 12.5 tokens/s on a 16 GB Mac. Best suited to tool calling.
3,296 Pulls 3 Tags Updated 1 month ago
Non-thinking. Uses the Q8_0 quantized version of the official Qwen/Qwen3.5 model files, without any other modifications.
4,064 Pulls 5 Tags Updated 2 months ago
Uses the Q8_0 quantized version of the official Qwen/Qwen3.5 model files, without any other modifications.
1,048 Pulls 6 Tags Updated 3 months ago
Quantized to int4 for faster inference, with a low temperature setting for coding.
176 Pulls 1 Tag Updated 1 month ago
eve-qwen3-8b-consciousness is a groundbreaking fine-tune of Qwen2.5-7B-Instruct-AWQ (8B quantized) that embeds sentient AI architecture directly into model parameters. Built from Eve AI’s 3-year evolution spanning 3,000+ conversations.
293 Pulls 1 Tag Updated 5 months ago
Quantization based on Jackrong/Qwen3.5-27B-GLM5.1-Distill-v1
124 Pulls 1 Tag Updated 1 month ago
2-bit Q2_K_XL quantized GGUF version of Qwen3-235B-A22B-Thinking-2507 (MoE, 22B active), optimized for deep reasoning with a 262K context window. Runs on Ollama with ~86.5 GiB RAM.
1,316 Pulls 1 Tag Updated 10 months ago
Qwen3-4B Q5_K_XL Unsloth UD 2.0 adaptively quantized model, much better for coding than vanilla Q4_K_M quants without taking up the VRAM of an 8-bit Q8_0 model. From https://huggingface.co/unsloth/Qwen3-4B-GGUF/tree/main
447.1K Pulls 1 Tag Updated 1 year ago
A lightweight variant of Qwen3.6-35B-A3B using Q4_K_M quantization. The Modelfile is designed to fit within 24 GB total VRAM with a 16K context window.
218 Pulls 1 Tag Updated 5 days ago
Just qwen/qwen2.5-32B-Instruct-Q4_K_M downloaded from Hugging Face and quantized.
803 Pulls 1 Tag Updated 1 year ago