ollama pull nub235/voyage-4-nano
Important:
Ollama shows 40K context and quick commands for apps because this model uses the Qwen3 architecture, but its actual context is smaller and it cannot be used in Claude Code or similar apps. I set the default context for this model to 4096 tokens to reduce memory usage, but this can be changed manually if needed.
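For a one-off request, a larger window can also be requested without editing the model. A minimal sketch, assuming a local Ollama server on the default port and the `requests` package (the 8192 value is just an example, not a recommendation):

```python
import requests

# Ask the local Ollama server for embeddings, overriding the 4096-token
# default context for this request only via the standard "num_ctx" option.
resp = requests.post(
    "http://localhost:11434/api/embed",
    json={
        "model": "nub235/voyage-4-nano",
        "input": "a longer document that needs more context...",
        "options": {"num_ctx": 8192},  # example value, adjust to your memory budget
    },
)
resp.raise_for_status()
embedding = resp.json()["embeddings"][0]
```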
Also:
This GGUF outputs embeddings of 1024 dimensions, not the model's native 2048 dimensions, because it is missing the final linear projection layer. It will still work and perform well, but it should not be dropped into workflows that already expect Voyage 4 embeddings. You can get the linear projection file and the code to use it from the original HF repo for this GGUF linked above.
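As a quick sanity check, this sketch (again assuming a local Ollama server and `requests`) retrieves an embedding and confirms the 1024-dimension output:

```python
import requests

# Request an embedding from the locally served GGUF.
resp = requests.post(
    "http://localhost:11434/api/embed",
    json={"model": "nub235/voyage-4-nano", "input": "hello world"},
)
resp.raise_for_status()
vector = resp.json()["embeddings"][0]

# Without the final linear projection this GGUF returns 1024-dim vectors,
# not the 2048-dim vectors that native Voyage 4 embeddings have.
print(len(vector))  # expected: 1024
```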