fredrezones55/Qwen3.5-APEX:mini

542 Downloads Updated 1 month ago

Qwen3.5-35B-A3B APEX GGUF: a novel MoE-aware mixed-precision quantization technique. Brought to you by the LocalAI team, the creators of LocalAI, the open-source AI engine that runs any model (LLMs, vision, image) on any hardware.

vision · tools · thinking

ollama run fredrezones55/Qwen3.5-APEX:mini

curl http://localhost:11434/api/chat \
  -d '{
    "model": "fredrezones55/Qwen3.5-APEX:mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='fredrezones55/Qwen3.5-APEX:mini',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'fredrezones55/Qwen3.5-APEX:mini',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

cb7ee281e439 · 14GB

model · 14GB
arch: qwen35moe
parameters: 35.1B
quantization: Q3_K_M

params · 65B
{ "presence_penalty": 1.5, "temperature": 1, "top_k": 20, "top_p": 0.95 }

template · 13B
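
The params entry above lists the sampling defaults shipped with this model. As a minimal sketch, assuming an Ollama server on the default local port, those values can also be passed per request through the `options` field of `/api/chat`, with individual values overridden as needed. The helper names `build_chat_request` and `send` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

MODEL = "fredrezones55/Qwen3.5-APEX:mini"

# Sampling defaults copied from the params entry on this page.
DEFAULT_OPTIONS = {
    "presence_penalty": 1.5,
    "temperature": 1,
    "top_k": 20,
    "top_p": 0.95,
}


def build_chat_request(prompt, overrides=None, host="http://localhost:11434"):
    """Build (but do not send) a POST request for Ollama's /api/chat endpoint.

    `overrides` is a hypothetical convenience argument: any keys it
    contains replace the matching defaults for this one request.
    """
    options = {**DEFAULT_OPTIONS, **(overrides or {})}
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "options": options,
        "stream": False,  # ask for a single JSON reply instead of a stream
    }
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running Ollama server and return the reply text."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["message"]["content"]
```

With a server running, `send(build_chat_request("Hello!", {"temperature": 0.6}))` would lower the temperature for that call while keeping the other defaults.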

Readme

An optimized Qwen3.5:35B MoE model with full vision support, packaged as a GGUF model.
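
To illustrate the vision support, here is a rough sketch of a chat payload that carries an image. It assumes the Ollama `/api/chat` convention of attaching base64-encoded images to a message through an `images` list. The function name `build_vision_payload` is hypothetical, and the payload is only constructed here, not sent.

```python
import base64
import json

MODEL = "fredrezones55/Qwen3.5-APEX:mini"


def build_vision_payload(prompt, image_bytes):
    """Return a JSON-serializable /api/chat payload with one attached image.

    `image_bytes` is the raw content of an image file (for example PNG or
    JPEG); it is base64-encoded because JSON cannot carry raw bytes.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": MODEL,
        "messages": [
            {"role": "user", "content": prompt, "images": [encoded]},
        ],
        "stream": False,
    }


def to_json(payload):
    """Serialize the payload for use as an HTTP request body."""
    return json.dumps(payload)
```

A caller would typically read the bytes from disk (for instance with `pathlib.Path(...).read_bytes()`) and POST the serialized payload to the local Ollama server.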

My pipeline needed patching for the qwen35moe architecture, but the GGUF model blob is fully functional with vision and tool calling. Ollama will not stop finetunes from showing their full potential.
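
For the tool-calling side, the sketch below shows one possible shape of a tool definition and a dispatcher for the model's tool calls. It assumes the Ollama `/api/chat` format, where tools are declared as function schemas and the reply message may carry a `tool_calls` list with `arguments` as a JSON object. The `get_weather` tool, the `REGISTRY` mapping, and `run_tool_calls` are all hypothetical, and the `tool_name` field on the result message is an assumption that should be checked against the Ollama version in use.

```python
def get_weather(city):
    """Hypothetical local tool; a real one would query a weather service."""
    return f"No live data in this sketch; requested city was {city}."


# Function schema passed to the model in the "tools" field of the request.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_calls(message):
    """Execute each tool call in an assistant message.

    Returns a list of "tool" role messages that can be appended to the
    conversation before asking the model to continue.
    """
    results = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        handler = REGISTRY[fn["name"]]
        output = handler(**fn["arguments"])
        results.append(
            {
                "role": "tool",
                "content": str(output),
                "tool_name": fn["name"],  # assumed field name, see lead-in
            }
        )
    return results
```

In a full loop, `TOOLS` would be sent with the chat request, and the messages returned by `run_tool_calls` would be appended to the history for a follow-up request.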