169 Downloads Updated 4 days ago
ollama run mannix/omnimerge-v4-mtp
8 models
Name · Size · Context · Input · Updated
omnimerge-v4-mtp:latest · 17GB · 256K context window · Text · 4 days ago
omnimerge-v4-mtp:IQ3_M · 13GB · 256K context window · Text · 4 days ago
omnimerge-v4-mtp:Q4_K_M · 17GB · 256K context window · Text · 4 days ago
omnimerge-v4-mtp:Q5_K_M · 20GB · 256K context window · Text · 4 days ago
omnimerge-v4-mtp:Q6_K · 22GB · 256K context window · Text · 4 days ago
omnimerge-v4-mtp:Q8_0 · 29GB · 256K context window · Text · 4 days ago
omnimerge-v4-mtp:IQ2_M · 10GB · 256K context window · Text · 4 days ago
omnimerge-v4-mtp:IQ4_XS · 15GB · 256K context window · Text · 4 days ago
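As a rough aid for choosing among the tags listed above, the sketch below picks the largest quantization whose download size fits a given memory budget and prints the matching `ollama run` command. The sizes are copied from the listing. The helper names (`pick_tag`, `run_command`) and the 4 GB default headroom are hypothetical, and treating file size as a stand-in for runtime memory is an assumption: actual usage also depends on context length and KV cache settings.

```python
# Download sizes in GB, copied from the tag list on this page.
# The "latest" tag is left out; it is listed at the same size as Q4_K_M.
TAG_SIZES_GB = {
    "IQ2_M": 10,
    "IQ3_M": 13,
    "IQ4_XS": 15,
    "Q4_K_M": 17,
    "Q5_K_M": 20,
    "Q6_K": 22,
    "Q8_0": 29,
}

MODEL = "mannix/omnimerge-v4-mtp"


def pick_tag(memory_gb, headroom_gb=4.0):
    """Return the largest tag whose file fits in memory_gb minus headroom_gb.

    headroom_gb is an assumed allowance for KV cache and runtime overhead,
    not a figure from the model page. Returns None if nothing fits.
    """
    budget = memory_gb - headroom_gb
    fitting = [(size, tag) for tag, size in TAG_SIZES_GB.items() if size <= budget]
    if not fitting:
        return None
    # max() compares by size first, so this selects the largest file that fits.
    return max(fitting)[1]


def run_command(tag):
    """Build the ollama command line for a specific tag."""
    return f"ollama run {MODEL}:{tag}"


if __name__ == "__main__":
    tag = pick_tag(24)
    print(run_command(tag) if tag else "no listed tag fits this budget")
```

With a 24 GB budget and the default headroom, this selects `Q5_K_M`. Lowering `headroom_gb` trades context length for a higher-precision quantization.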
GGUF quantizations of ManniX-ITA/Qwen3.6-27B-Omnimerge-v4 with the MTP (Multi-Token Prediction) head retained for self-speculative decoding on llama.cpp mainline (PR #22673, merged 2026-05-16) and later.
Up to 4x faster inference with a single session, and 2x with two sessions running in parallel.