
Voyage AI’s state-of-the-art embedding model in Q8 GGUF for easy use. Important: see the README for more info. I did not make this GGUF; credit for the quant goes to jsonMartin. See https://hf.co/jsonMartin/voyage-4-nano-gguf

ollama pull nub235/voyage-4-nano
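
Once pulled, the model can be queried for embeddings through Ollama's API. A minimal sketch, assuming a local Ollama server on the default port and its /api/embed endpoint:

```python
# Request embeddings from a locally running Ollama server.
import requests

resp = requests.post(
    "http://localhost:11434/api/embed",
    json={
        "model": "nub235/voyage-4-nano",
        "input": ["a search query", "a candidate document to rank against it"],
    },
)
resp.raise_for_status()
embeddings = resp.json()["embeddings"]  # one vector per input string
print(len(embeddings), len(embeddings[0]))
```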



Readme

Important:

Ollama shows a 40K context window and quick commands for apps because this model uses the Qwen3 architecture, but its actual context is smaller and it cannot be used in Claude Code or similar apps. I set the default context for this model to 4096 tokens to reduce memory use, but this can be changed manually if needed (see the sketch below).
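
For example, the context window can be raised per request by passing num_ctx in the options field. A sketch, assuming the same local /api/embed endpoint as above:

```python
# Override the 4096-token default context for a single embedding request.
import requests

resp = requests.post(
    "http://localhost:11434/api/embed",
    json={
        "model": "nub235/voyage-4-nano",
        "input": "a longer document that needs more context",
        "options": {"num_ctx": 8192},  # raise the context window for this call
    },
)
resp.raise_for_status()
print(len(resp.json()["embeddings"][0]))
```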

Also:

This GGUF outputs embeddings of 1024 dimensions rather than the model’s native 2048 dimensions, because it is missing the final linear projection layer. It will still work and perform well, but it should not be dropped into workflows that already expect Voyage 4 embeddings. You can get the linear projection file, and code to use it, from the original HF repo for this GGUF linked above.
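
If you do need vectors in the native 2048-dimensional space, the projection can be applied after the fact. A rough sketch, assuming the projection is available as a plain weight matrix (the file name and shape below are hypothetical; use the actual file and loader code from the HF repo above):

```python
# Lift 1024-d GGUF embeddings toward the model's native 2048-d output.
import numpy as np

# Hypothetical file/format; the real projection file and loader live in the HF repo above.
projection = np.load("linear_projection.npy")   # assumed shape (1024, 2048)

emb_1024 = np.random.rand(2, 1024)              # stand-in for vectors returned by Ollama
emb_2048 = emb_1024 @ projection                # projected vectors, shape (2, 2048)
print(emb_2048.shape)
```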