Run with Ollama:

ollama run sam860/qwen3-reranker:0.6b-Q8_0


Readme

Description:

The Qwen3-Reranker-0.6B is a specialized text reranking model, part of the Qwen3 series, designed to refine the relevance of retrieved text passages. It’s built upon the foundational Qwen3-0.6B-Base model and is particularly effective when combined with an embedding model for a comprehensive retrieval pipeline.

Key Features & Use Cases:

  • Model Type: A Text Reranking model, specifically designed to reorder initial search results based on relevance. It complements text embedding models within a broader information retrieval system.
  • Size: A compact model with 0.6 billion parameters, striking a balance between performance and computational efficiency. Larger reranking models (4B, 8B) are also available in the series for more demanding tasks.
  • Multilingual Capability: Supports over 100 languages, including programming languages, inheriting the robust multilingual and cross-lingual understanding from its foundational Qwen3 series.
  • Context Length: Processes inputs up to 32K tokens, allowing longer documents or contexts to be reranked.
  • Instruction Awareness: Supports user-defined instructions to fine-tune its reranking behavior for specific tasks, languages, or scenarios. This feature can improve performance in retrieval scenarios by 1% to 5%.
  • Applications: Primarily used for:
    • Text Reranking: Enhancing the accuracy of search results by re-scoring and reordering retrieved documents.
    • Text Retrieval: As a crucial second stage after initial retrieval by an embedding model, to improve the relevance of the final output.
    • Code Retrieval: Refining search results for code snippets.
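The two-stage retrieval flow described above can be sketched in plain Python. This is a toy illustration, not the model's actual scoring: `toy_score` (simple word overlap) stands in for the relevance score that Qwen3-Reranker-0.6B would produce, and all function names here are illustrative.

```python
import re

# Sketch of a two-stage retrieval pipeline: a cheap first-stage retriever
# narrows the corpus, then a reranker re-scores the survivors.
# toy_score is a word-overlap stand-in for the reranker model's score.

def tokenize(text):
    return set(re.findall(r"\w+", text.lower()))

def first_stage_retrieve(query, corpus, k=3):
    """Cheap first pass: keep the k docs sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return ranked[:k]

def toy_score(query, doc):
    """Jaccard word overlap; a placeholder for the model's relevance score."""
    q, d = tokenize(query), tokenize(doc)
    return len(q & d) / max(len(q | d), 1)

def rerank(query, candidates, score_fn):
    """Second stage: re-score each candidate and return them best-first."""
    return sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)

corpus = [
    "Gravity causes objects to fall.",
    "The capital of France is Paris.",
    "Paris is known for the Eiffel Tower.",
    "Bananas are rich in potassium.",
]
candidates = first_stage_retrieve("capital of France", corpus)
ranked = rerank("capital of France", candidates, toy_score)
```

In a real deployment the first stage would be an embedding model over a vector index, and `toy_score` would be replaced by a call to the reranker.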

Usage Notes:

  • For optimal results, it’s highly recommended to customize the input instruction (instruct) based on your specific use case, task, and language.
  • The model integrates with the Hugging Face Transformers library and with vLLM for efficient inference.
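As a sketch of the instruction customization mentioned above, the helper below builds one query/document input with a task instruction prepended. The `<Instruct>`/`<Query>`/`<Document>` template mirrors the pattern shown on the Qwen3-Reranker model card; check the card for the exact prompt (including the surrounding system message) before relying on it, and note that the function name and default instruction here are illustrative.

```python
# Sketch of instruction-aware input formatting for the reranker.
# The template follows the pattern on the Qwen3-Reranker model card;
# verify the exact prompt against the card before production use.

DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)

def format_pair(query, document, instruction=DEFAULT_INSTRUCTION):
    """Build one rerank input string from a query/document pair and a task instruction."""
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {document}"

pair = format_pair(
    "how to sort a list in python",
    "Use the built-in sorted() function or list.sort().",
    instruction="Given a programming question, retrieve passages that answer it",
)
```

Tailoring the instruction to the task and language is what the model card credits with the 1%–5% retrieval improvement noted above.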

Source: Hugging Face