
- m3e
  Moka-AI Massive Mixed Embedding
  embedding · 6,618 Pulls · 7 Tags · Updated 1 year ago
- QwQ-32B-0305
  QwQ is the reasoning model of the Qwen series.
  tools · 3,352 Pulls · 1 Tag · Updated 6 months ago
- dmeta-embedding-zh
  Dmeta-embedding is a cross-domain, cross-task, out-of-the-box Chinese embedding model.
  embedding · 2,622 Pulls · 2 Tags · Updated 1 year ago
- gte
  General Text Embeddings (GTE) model from Alibaba DAMO Academy, described in "Towards General Text Embeddings with Multi-stage Contrastive Learning".
  embedding · 2,495 Pulls · 2 Tags · Updated 1 year ago
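The embedding entries above (m3e, dmeta-embedding-zh, gte) are all queried the same way once pulled into Ollama. Below is a minimal sketch, assuming a local Ollama server on its default port and the `gte` tag from this listing; it is an illustration, not an example taken from the catalog, and the exact response fields can vary between Ollama versions.

```python
# Minimal sketch: fetching an embedding from a locally served model.
# Assumes `ollama pull gte` has been run and the Ollama server is listening
# on its default port (11434); not an official example from this listing.
import requests

def embed(text: str, model: str = "gte") -> list[float]:
    # The /api/embeddings endpoint takes a model tag and a prompt and
    # returns a single embedding vector for that prompt.
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": model, "prompt": text},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

if __name__ == "__main__":
    vec = embed("General Text Embeddings example sentence")
    print(len(vec), vec[:5])
```

The same call works for m3e or dmeta-embedding-zh by swapping the model tag; downstream similarity search is then just a cosine distance over the returned vectors.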
- deepseek-v3-UD
  (Unsloth Dynamic Quants) A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
  2,003 Pulls · 3 Tags · Updated 7 months ago
- reader-lm-v2
  ReaderLM-v2 is a 1.5B-parameter language model that converts raw HTML into cleanly formatted Markdown or JSON, with superior accuracy and improved long-context handling.
  1,164 Pulls · 3 Tags · Updated 8 months ago
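Because ReaderLM-v2's whole job is HTML-to-Markdown conversion, using it amounts to a single generate call with the raw HTML as the prompt. A minimal sketch follows, assuming the `reader-lm-v2` tag from this listing and a local Ollama server; the prompt convention is an assumption, not taken from the entry above.

```python
# Minimal sketch: converting raw HTML to Markdown with reader-lm-v2 via the
# Ollama generate endpoint. Assumes the model has been pulled locally and the
# server is on its default port; not an official example from this listing.
import requests

def html_to_markdown(html: str, model: str = "reader-lm-v2") -> str:
    # /api/generate runs a single non-streaming completion and returns the
    # generated text under "response".
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": html, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    sample = "<html><body><h1>Title</h1><p>Hello <b>world</b>.</p></body></html>"
    print(html_to_markdown(sample))
```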
- bilibili-index
  A large language model developed in-house by Bilibili; the Index-1.9B series is the lightweight edition of the Index model family.
  909 Pulls · 3 Tags · Updated 1 year ago
- Simplescaling-S1
  s1 is a reasoning model fine-tuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview and exhibits test-time scaling via budget forcing.
  tools · 544 Pulls · 3 Tags · Updated 7 months ago
- Kalomaze-Qwen3-16B-A3B
  Qwen3-16B-A3B is a derivative of Qwen3-30B-A3B by kalomaze.
  tools · 377 Pulls · 3 Tags · Updated 4 months ago
- deepseek-v2.5-1210
  DeepSeek-V2.5-1210 is an upgraded version of DeepSeek-V2.5, offering enhanced mathematical, coding, writing, and reasoning capabilities.
  339 Pulls · 3 Tags · Updated 8 months ago
- rwkv-6-world
  RWKV (pronounced RwaKuv) is an RNN with great LLM performance.
  308 Pulls · 1 Tag · Updated 9 months ago
- Qihoo360-Light-R1-14B-DS
  Light-R1-14B-DS is the state-of-the-art 14B math model, with AIME24 and AIME25 scores of 74.0 and 60.2, outperforming many 32B models.
  294 Pulls · 1 Tag · Updated 6 months ago
- Qihoo360-Light-R1-32B
  Light-R1: Surpassing R1-Distill from Scratch* with $1000 through Curriculum SFT & DPO.
  tools · 207 Pulls · 1 Tag · Updated 6 months ago
- deepseek-r1-UD
  (Unsloth Dynamic Quants) DeepSeek's first-generation reasoning model, with performance comparable to OpenAI o1; this is the full 671B MoE model, not a dense distilled variant.
  189 Pulls · 2 Tags · Updated 7 months ago
- deepseek-v2.5-1210-UD
  (Unsloth Dynamic Quants) DeepSeek-V2.5-1210 is an upgraded version of DeepSeek-V2.5, offering enhanced mathematical, coding, writing, and reasoning capabilities.
  176 Pulls · 3 Tags · Updated 7 months ago
- GLM-4-9B-0414
  The 9B model in the GLM-4-0414 series.
  118 Pulls · 1 Tag · Updated 3 months ago
- Qwen3-UD
  (Unsloth Dynamic 2.0 Quants) Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
  tools · 96 Pulls · 1 Tag · Updated 4 months ago