
-
m3e
Moka-AI Massive Mixed Embedding
embedding 5,610 Pulls 7 Tags Updated 12 months ago
-
QwQ-32B-0305
QwQ is the reasoning model of the Qwen series.
tools 3,095 Pulls 1 Tag Updated 4 weeks ago
-
gte
General Text Embeddings (GTE) model, trained by Alibaba DAMO Academy and described in "Towards General Text Embeddings with Multi-stage Contrastive Learning".
embedding 2,002 Pulls 2 Tags Updated 12 months ago
-
deepseek-v3-UD
(Unsloth Dynamic Quants) A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
1,864 Pulls 3 Tags Updated 8 weeks ago
-
dmeta-embedding-zh
Dmeta-embedding is a cross-domain, cross-task, out-of-the-box Chinese embedding model.
embedding 1,744 Pulls 2 Tags Updated 12 months ago
-
bilibili-index
A large language model developed in-house by Bilibili. The Index-1.9B series is the lightweight version of the Index model family.
804 Pulls 3 Tags Updated 9 months ago
-
reader-lm-v2
ReaderLM-v2 is a 1.5B parameter language model that converts raw HTML into beautifully formatted markdown or JSON with superior accuracy and improved longer context handling.
609 Pulls 3 Tags Updated 2 months ago
-
Simplescaling-S1
s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview & exhibits test-time scaling via budget forcing.
tools 524 Pulls 3 Tags Updated 7 weeks ago
-
deepseek-v2.5-1210
DeepSeek-V2.5-1210 is an upgraded version of DeepSeek-V2.5, offering enhanced mathematical, coding, writing, and reasoning capabilities.
278 Pulls 3 Tags Updated 2 months ago
-
rwkv-6-world
RWKV (pronounced RwaKuv) is an RNN with great LLM performance.
274 Pulls 1 Tag Updated 3 months ago
-
Qihoo360-Light-R1-14B-DS
Light-R1-14B-DS is a state-of-the-art 14B math model, scoring 74.0 on AIME24 and 60.2 on AIME25 and outperforming many 32B models.
190 Pulls 1 Tag Updated 2 weeks ago
-
Qihoo360-Light-R1-32B
Light-R1: Surpassing R1-Distill from Scratch* with $1000 through Curriculum SFT & DPO
tools 158 Pulls 1 Tag Updated 4 weeks ago
-
deepseek-v2.5-1210-UD
(Unsloth Dynamic Quants) DeepSeek-V2.5-1210 is an upgraded version of DeepSeek-V2.5, offering enhanced mathematical, coding, writing, and reasoning capabilities.
145 Pulls 3 Tags Updated 2 months ago
-
deepseek-r1-UD
(Unsloth Dynamic Quants) DeepSeek's first generation of reasoning models, with performance comparable to OpenAI-o1. This is the full 671B MoE model, not one of the dense distilled models.
123 Pulls 2 Tags Updated 8 weeks ago