Details

Updated 9 months ago

9d6f5dc4851c · 15GB

model: arch qwen3 · parameters 7.57B · quantization F16 · 15GB
template: 12B
license: Apache License, Version 2.0, January 2004, http://www.apache.org/licenses/ · 11kB
Qwen3-Embedding-8B-f16

F16 precision, 8B parameters, 32K context; embeddings adjustable from 32 to 4096 dimensions, suited to multilingual retrieval and code search.

Overview

Qwen3-Embedding-8B-f16 is an F16, 8B-parameter text embedding model sourced from the Qwen/Qwen3-Embedding-8B-GGUF repository. It supports a 32K context window and configurable 32–4096-dimensional embeddings, and performs well in multilingual retrieval, classification, clustering, cross-lingual matching, and code search.

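Key Features

Multilingual: covers 100+ languages; scored 70.58 on the multilingual MTEB leaderboard as of 2025-06-05, ranking near the top (defer to the official model card).
Configurable dims: 4096 dimensions by default; can be reduced to as few as 32 to trade quality against cost.
High-fidelity: built from the official GGUF F16 export, which makes it easy to integrate with reranker models and vector databases.
License: Apache-2.0; retain the license notice when deploying or redistributing.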
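The configurable-dimension feature can be illustrated with a short Python sketch. The README says only that the embedding can be reduced from 4096 to as few as 32 dimensions; the sketch assumes this means keeping the leading components and rescaling to unit length, as is common for Matryoshka-style embeddings. That assumption should be checked against the official model card. `truncate_embedding` is a hypothetical helper, not part of Ollama or Qwen.

```python
import math

def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale the result to unit length.

    Assumption: the model's leading dimensions carry the most information,
    so a prefix of the full vector is a usable lower-dimensional embedding.
    """
    if not 1 <= dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # all-zero prefix: nothing to normalize
    return [x / norm for x in head]
```

Smaller vectors cut storage and search cost roughly in proportion to `dims`, at some loss in retrieval quality, so it is worth measuring quality on your own data before settling on a size.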

Usage

Install and start the Ollama daemon
Pull the model

   ollama pull l284190056/Qwen3-Embedding-8B-f16

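Generate embeddings (through the Ollama REST API; the default local port 11434 is assumed)

   curl http://localhost:11434/api/embed -d '{"model": "l284190056/Qwen3-Embedding-8B-f16", "input": "测试一下 / Try a quick test"}'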
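An embedding request can also be issued from Python through Ollama's REST API. The sketch below uses only the standard library and assumes the daemon listens on the default local address `http://localhost:11434`. `build_embed_request` and `embed` are illustrative helper names, not part of any Ollama client library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embed"  # assumption: default local daemon
MODEL = "l284190056/Qwen3-Embedding-8B-f16"

def build_embed_request(texts, model=MODEL, url=OLLAMA_URL):
    """Build a POST request carrying the model name and input texts as JSON."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(texts, model=MODEL, url=OLLAMA_URL):
    """Send the request and return one embedding (a list of floats) per input text."""
    with urllib.request.urlopen(build_embed_request(texts, model, url)) as response:
        body = json.load(response)
    return body["embeddings"]
```

With the daemon running and the model pulled, `embed(["Try a quick test"])` should return a list containing a single vector. Passing several texts in one call batches them into one request.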

Integration hints
- Combine with a vector database such as Milvus, Qdrant, or Weaviate for semantic search.
- Pair with a reranker model to improve ranking quality.
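To make the semantic-search hint concrete, here is a minimal in-memory sketch of the ranking step a vector database performs: score each stored vector against the query by cosine similarity and return the best matches. It is a stand-in for illustration only and does not use the Milvus, Qdrant, or Weaviate APIs. `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vector, indexed, k=3):
    """Rank (doc_id, vector) pairs against the query; return [(doc_id, score)], best first."""
    scored = [(doc_id, cosine_similarity(query_vector, vec)) for doc_id, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A real vector database replaces this linear scan with an approximate nearest-neighbor index, which matters once the collection grows beyond what a full scan can handle.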

Version Notes

Built from the official GGUF F16 file on Hugging Face.
The Modelfile includes the model description and the Apache-2.0 license text.

References

Blog: https://qwenlm.github.io/blog/qwen3-embedding/
GitHub: https://github.com/QwenLM/Qwen3-Embedding
Hugging Face Model Card: https://huggingface.co/Qwen/Qwen3-Embedding-8B-GGUF

Qwen3-Embedding-8B-f16

F16 precision • 8B parameters • 32K context window • 32–4096 configurable embedding dimensions

Overview

Qwen3-Embedding-8B-f16 is an F16, 8B-parameter text-embedding model sourced from the Qwen/Qwen3-Embedding-8B-GGUF repository. It supports a 32K context window and configurable 32–4096-dimensional embeddings, delivering strong performance for multilingual retrieval, classification, clustering, cross-lingual alignment, and code search.

Key Features

Multilingual coverage across 100+ languages; 70.58 on the multilingual MTEB leaderboard as of 2025-06-05 (see the official model card for details).
Configurable dimensions: default 4096-d, adjustable down to 32-d to balance quality and cost.
High fidelity: built from the official GGUF F16 export for seamless pairing with rerankers and vector databases.
License: Apache-2.0. Retain the license notice for deployments and redistribution.

Usage

Install and start Ollama (local or remote daemon).
Pull the model

   ollama pull l284190056/Qwen3-Embedding-8B-f16

Generate embeddings

   curl http://localhost:11434/api/embed -d '{"model": "l284190056/Qwen3-Embedding-8B-f16", "input": "Try a quick test"}'

Integration hints
- Connect to vector databases such as Milvus, Qdrant, or Weaviate for semantic search.
- Pair with a reranker to improve ranking quality on sensitive queries.
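The reranker hint describes a two-stage pipeline: a fast vector search proposes candidates, then a slower, more precise scorer reorders them. The sketch below shows only that control flow. The scorer `word_overlap` is a toy placeholder standing in for a real reranker model, and both function names are hypothetical.

```python
def retrieve_then_rerank(query, candidates, rerank_score, keep=3):
    """Reorder vector-search candidates with a second-stage scorer.

    candidates: list of (doc_text, retrieval_score), e.g. the output of a vector search.
    rerank_score: callable (query, doc_text) -> float, higher means more relevant.
    """
    rescored = [(doc, rerank_score(query, doc)) for doc, _ in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:keep]

def word_overlap(query, doc):
    """Toy stand-in scorer: number of lowercase words shared by query and document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))
```

In practice `rerank_score` would call a reranker model on each (query, document) pair, so the candidate list is usually kept short to bound latency.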

Version Notes

Built from the official Hugging Face GGUF F16 artifact.
The Modelfile includes descriptive metadata and the Apache-2.0 license text.

References

Blog: https://qwenlm.github.io/blog/qwen3-embedding/
GitHub: https://github.com/QwenLM/Qwen3-Embedding
Hugging Face Model Card: https://huggingface.co/Qwen/Qwen3-Embedding-8B-GGUF

from https://huggingface.co/Qwen/Qwen3-Embedding-8B-GGUF
