Build guide: https://github.com/ollama/ollama/pull/11389#issuecomment-3089786702
Official Qwen model notes: https://qwenlm.github.io/zh/blog/qwen3-embedding/
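Compared with the release from a month ago, this version also supports the bge-reranker-v2-m3 model. Download it from the new address below. The Modelfile has been upgraded and is no longer compatible with the first-generation one.
https://github.com/AuditAIH/ollama-rerank/releases/download/0.2/ollama_rerank_v2.tar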
git clone https://github.com/sinjab/ollama.git
cd ollama
git checkout reranking-implementation
go build .
OLLAMA_NEW_ENGINE=1 ./ollama serve
git clone https://github.com/sinjab/ollama.git
cd ollama
git checkout reranking-implementation
cmake -B build
cmake --build build
go build .
export OLLAMA_HOST=0.0.0.0:11436
export OLLAMA_MODELS=/usr/share/ollama/.ollama/models/
OLLAMA_NEW_ENGINE=1 ./ollama serve
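(To move the build to another intranet machine, create a folder with the same structure, containing ./ollama and ./lib/ollama/*.so. On Linux the library files are in the build output directory ../lib/ollama, and the CUDA libraries it depends on are listed by ldd ./build/lib/ollama/libggml-cuda.so.)
The final folder must follow this layout for CUDA to run correctly.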
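The folder layout described above can be sketched as a small shell helper. This is only an illustration: `bundle_ollama`, `resolved_libs`, and the example paths are assumptions made here, not part of Ollama or of the linked build guide. Which CUDA libraries to copy still has to be decided from the `ldd` listing.

```shell
#!/bin/sh
# Sketch: assemble a relocatable folder holding ./ollama and ./lib/ollama/*.so.
# bundle_ollama and resolved_libs are illustrative helpers, not Ollama commands.

# Print the resolved paths from `ldd` output read on stdin
# (lines of the form "libfoo.so => /path/libfoo.so (0x...)").
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# bundle_ollama BIN LIB_DIR DEST
# Copies the binary to DEST/ollama and every .so in LIB_DIR to DEST/lib/ollama/.
bundle_ollama() {
    bin=$1 lib_dir=$2 dest=$3
    mkdir -p "$dest/lib/ollama" || return 1
    cp "$bin" "$dest/ollama" || return 1
    cp "$lib_dir"/*.so "$dest/lib/ollama/" || return 1
    chmod +x "$dest/ollama"
}

# Example (paths assumed from the build steps above):
#   bundle_ollama ./ollama ./build/lib/ollama ./ollama-rerank
#   ldd ./build/lib/ollama/libggml-cuda.so | resolved_libs
# Copy the CUDA libraries from that list into ./ollama-rerank/lib/ollama/ by hand.
```

Keeping the `ldd` parsing separate from the copy step means the list of dependencies can be reviewed before anything is added to the bundle.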
#### The files I have already compiled are available below. If you build it yourself, you need to add the four CUDA dependency files.
wget https://github.com/AuditAIH/ollama-rerank/releases/download/0.2/ollama_rerank_v2.tar
tar -xzvf ollama_rerank_v2.tar
cd ollama-rerank
You can get models from Hugging Face or the official Ollama repository:
#### Download other models:
ollama pull hf.co/mradermacher/Qwen3-Reranker-8B-GGUF:F16
#### Modify the template file, using my 4B template as a reference.
FROM ./Qwen3-Reranker-4B.Q5_K_M.gguf
TEMPLATE """<|im_start|>system
Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>
<|im_start|>user
<Instruct>: Please judge relevance.
<Query>: {{ .Query }}
<Document>: {{ .Document }}<|im_end|>
<|im_start|>assistant
<think>
</think>
"""
My Modelfile template for bge-reranker-v2-m3:
FROM ./bge-reranker-v2-m3-FP16.gguf
TEMPLATE """Query: {{ .Query }}
Document: {{ .Document }}
Relevance: """
PARAMETER temperature 0
Reference: https://huggingface.co/mradermacher/Qwen3-Reranker-8B-GGUF/tree/main?local-app=ollama
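Once a Modelfile is written, it has to be registered with the server before it can be used. The sketch below writes the bge-reranker-v2-m3 Modelfile shown above and registers it with the standard `ollama create NAME -f FILE` command. The helper `write_bge_modelfile` and the model name `bge-reranker-v2-m3` are illustrative choices, not names taken from the release.

```shell
#!/bin/sh
# Sketch: write the bge-reranker-v2-m3 Modelfile, then register it.
# write_bge_modelfile and the model name below are illustrative.

# write_bge_modelfile PATH
write_bge_modelfile() {
    # The quoted 'EOF' keeps {{ .Query }} and {{ .Document }} literal.
    cat > "$1" <<'EOF'
FROM ./bge-reranker-v2-m3-FP16.gguf
TEMPLATE """Query: {{ .Query }}
Document: {{ .Document }}
Relevance: """
PARAMETER temperature 0
EOF
}

# Usage, from the folder that holds the GGUF file, with the server running
# and OLLAMA_HOST pointing at it:
#   write_bge_modelfile Modelfile
#   ./ollama create bge-reranker-v2-m3 -f Modelfile
```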
We recommend running the Ollama service on its own port:
# Set the service address and port
export OLLAMA_HOST=0.0.0.0:11436
OLLAMA_NEW_ENGINE=1 ./ollama serve &
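Because the server is started in the background, the next command can run before it is ready to accept requests. A minimal sketch of a readiness check follows. It assumes this build still exposes upstream Ollama's `/api/version` endpoint; `wait_for_ollama` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: block until the background server answers, or give up.
# wait_for_ollama is an illustrative helper; /api/version is assumed to be
# available in this build as it is in upstream Ollama.

# wait_for_ollama BASE_URL [ATTEMPTS]
# Returns 0 as soon as the server responds, 1 after ATTEMPTS failed tries.
wait_for_ollama() {
    url=$1 attempts=${2:-30}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl -sf --max-time 2 "$url/api/version" >/dev/null 2>&1; then
            return 0
        fi
        # Pause between tries, but not after the last one.
        [ "$i" -lt "$attempts" ] && sleep 1
        i=$((i + 1))
    done
    return 1
}

# Usage:
#   wait_for_ollama http://localhost:11436 && echo "server is up"
```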
#### Quickly pull the model
./ollama pull AuditAid/Reranker_v2:Qwen3-Reranker-4B_Q5_K_M
Use the following command to test whether the service is working:
curl -X POST http://localhost:11436/api/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "model": "AuditAid/Reranker_v2:Qwen3-Reranker-4B_Q5_K_M",
    "query": "What is machine learning?",
    "documents": [
      "Machine learning is a subset of artificial intelligence",
      "The weather today is sunny and warm"
    ]
  }'
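If the binary is not executable, run chmod +x ollama to grant execute permission. The service port is the port number set in OLLAMA_HOST.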