
Voyage AI’s state-of-the-art embedding model in Q8 GGUF for easy use. Important: see the README for more info. I did not make this GGUF; credit for the quant goes to jsonMartin. See https://hf.co/jsonMartin/voyage-4-nano-gguf

ollama pull nub235/voyage-4-nano
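
Once pulled, the model can be queried for embeddings through Ollama's API. A minimal sketch, assuming a local Ollama server on the default port and its /api/embed endpoint:

```python
# Request embeddings from a locally running Ollama server.
import requests

resp = requests.post(
    "http://localhost:11434/api/embed",
    json={
        "model": "nub235/voyage-4-nano",
        "input": ["a search query", "a candidate document to rank against it"],
    },
)
resp.raise_for_status()
embeddings = resp.json()["embeddings"]  # one vector per input string
print(len(embeddings), len(embeddings[0]))
```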



Readme

Important:

Ollama shows a 40K context window and quick commands for apps because this model uses the Qwen3 architecture, but its actual context is smaller and it cannot be used in Claude Code or similar apps. I set the default context for this model to 4096 tokens to reduce memory use, but this can be changed manually if needed (see the sketch below).
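
For example, the context window can be raised per request by passing num_ctx in the options field. A sketch, assuming the same local /api/embed endpoint as above:

```python
# Override the 4096-token default context for a single embedding request.
import requests

resp = requests.post(
    "http://localhost:11434/api/embed",
    json={
        "model": "nub235/voyage-4-nano",
        "input": "a longer document that needs more context",
        "options": {"num_ctx": 8192},  # raise the context window for this call
    },
)
resp.raise_for_status()
print(len(resp.json()["embeddings"][0]))
```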

Also:

This GGUF outputs embeddings of 1024 dimensions rather than the model’s native 2048 dimensions, because it is missing the final linear projection layer. It will still work and perform well, but it should not be dropped into workflows that already expect Voyage 4 embeddings. You can get the linear projection file, and code to use it, from the original HF repo for this GGUF linked above.
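
If you do need vectors in the native 2048-dimensional space, the projection can be applied after the fact. A rough sketch, assuming the projection is available as a plain weight matrix (the file name and shape below are hypothetical; use the actual file and loader code from the HF repo above):

```python
# Lift 1024-d GGUF embeddings toward the model's native 2048-d output.
import numpy as np

# Hypothetical file/format; the real projection file and loader live in the HF repo above.
projection = np.load("linear_projection.npy")   # assumed shape (1024, 2048)

emb_1024 = np.random.rand(2, 1024)              # stand-in for vectors returned by Ollama
emb_2048 = emb_1024 @ projection                # projected vectors, shape (2, 2048)
print(emb_2048.shape)
```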