542 downloads · updated 1 month ago

Qwen3.5-35B-A3B APEX GGUF: a novel MoE-aware mixed-precision quantization technique. Brought to you by the LocalAI team, the creators of LocalAI, the open-source AI engine that runs any model (LLMs, vision, image) on any hardware.

vision tools thinking
ollama run fredrezones55/Qwen3.5-APEX:mini

Details


cb7ee281e439 · 14GB

qwen35moe · 35.1B · Q3_K_M
{ "presence_penalty": 1.5, "temperature": 1, "top_k": 20, "top_p": 0.95 }
{{ .Prompt }}

Readme

An optimized Qwen3.5:35B MoE model in GGUF format, with full vision support.

My pipeline needed patching for the qwen35moe architecture, but the GGUF model blob is fully functional with vision and tool use. Ollama will not stop finetunes from showing their full potential.
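
Besides `ollama run`, the model can also be queried over Ollama's local HTTP API. The following is a minimal sketch, not an official client: it assumes an Ollama server listening on its default address (`http://localhost:11434`) and reuses the sampling parameters listed under Details as per-request options. The helper names `build_payload` and `generate` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "fredrezones55/Qwen3.5-APEX:mini"

# Sampling defaults copied from the model's Details section.
DEFAULT_OPTIONS = {
    "presence_penalty": 1.5,
    "temperature": 1,
    "top_k": 20,
    "top_p": 0.95,
}


def build_payload(prompt, **overrides):
    """Return the request body for a non-streaming generation (hypothetical helper)."""
    return {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
        # Keyword overrides replace individual defaults, e.g. temperature=0.7.
        "options": {**DEFAULT_OPTIONS, **overrides},
    }


def generate(prompt, **overrides):
    """Send the request and return the generated text (needs a running server)."""
    body = json.dumps(build_payload(prompt, **overrides)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Prints the request body only; call generate(...) to query the model.
    print(json.dumps(build_payload("Hello", temperature=0.7), indent=2))
```

Separating payload construction from the network call makes it easy to inspect or adjust the options before anything is sent to the server.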