
Build llama.cpp For Reranker

ollama run AuditAid/Qwen3_Reranker:0.6B_FP16

Details

Updated 7 months ago

8c5c249ee00f · 1.2GB · qwen3 · 596M · F16

template: {{ .Query }}{{ .Document }}
params: { "temperature": 0 }

Readme

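Update (2026.1.9)

If official support is added later, this model will continue to be updated. For now, deploying with llama.cpp is recommended.

Deployment reference: https://github.com/AuditAIH/rerank_for_dify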

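Quick Start

(To use a proxy or CPU mode, add the --proxy and --cpu parameters.)

# -q: quiet mode (no download log output); -O-: write the content to standard output instead of a file; | bash: pipe it to bash for execution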

wget -qO- https://raw.githubusercontent.com/AuditAIH/rerank_for_dify/main/startup_llama.cpp.sh | sudo bash

# wget -qO- https://raw.githubusercontent.com/AuditAIH/rerank_for_dify/main/startup_llama.cpp.sh | sudo bash -s -- --proxy --cpu

Request:

curl -X POST http://localhost:11435/v1/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-reranker",
    "query": "Apple",
    "documents": [
      "apple",
      "banana",
      "fruit",
      "vegetable"
    ]
  }'
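
The same request can also be issued from code. The following is a minimal Python sketch using only the standard library; it assumes the server started by the quick-start script is listening on localhost:11435 as in the curl example. The helper names `build_rerank_request` and `rerank` are illustrative and are not part of llama.cpp or the rerank_for_dify repository.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example above.
BASE_URL = "http://localhost:11435"


def build_rerank_request(query, documents, model="Qwen3-reranker", base_url=BASE_URL):
    """Build the POST request for /v1/rerank without sending it."""
    payload = {"model": model, "query": query, "documents": list(documents)}
    return urllib.request.Request(
        url=f"{base_url}/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rerank(query, documents, timeout=30.0, **kwargs):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_rerank_request(query, documents, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `rerank("Apple", ["apple", "banana", "fruit", "vegetable"])` sends the same body as the curl command. Building the request in a separate function keeps the payload inspectable without a live server.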

Expected response:

{
  "model": "Qwen3-reranker",
  "object": "list",
  "usage": {"prompt_tokens": 296, "total_tokens": 296},
  "results": [
    {"index": 0, "relevance_score": 0.9830306172370911},
    {"index": 2, "relevance_score": 0.001323841861449182},
    {"index": 1, "relevance_score": 0.0005493499338626862},
    {"index": 3, "relevance_score": 0.0002995904360432178}
  ]
}
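
The response carries indices rather than document text, so a client has to map each result back to its input. The sketch below assumes that `index` is the zero-based position of the document in the request's `documents` array, which matches the sample above ("apple" at index 0 scores highest for the query "Apple"). The function name `rank_documents` is illustrative.

```python
import json


def rank_documents(documents, response):
    """Pair each result with its source document, highest score first."""
    ranked = [
        (documents[item["index"]], item["relevance_score"])
        for item in response["results"]
    ]
    # Sort explicitly instead of relying on the order the server returns.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Documents from the request and the sample response shown above.
documents = ["apple", "banana", "fruit", "vegetable"]
response = json.loads("""
{
  "model": "Qwen3-reranker",
  "object": "list",
  "usage": {"prompt_tokens": 296, "total_tokens": 296},
  "results": [
    {"index": 0, "relevance_score": 0.9830306172370911},
    {"index": 2, "relevance_score": 0.001323841861449182},
    {"index": 1, "relevance_score": 0.0005493499338626862},
    {"index": 3, "relevance_score": 0.0002995904360432178}
  ]
}
""")

for text, score in rank_documents(documents, response):
    print(f"{score:.4f}  {text}")
```

For the sample response this lists apple first, followed by fruit, banana, and vegetable.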