A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
405.6K Pulls 9 Tags Updated 1 year ago
Rnj-1 is a family of 8B parameter open-weight, dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models.
474.4K Pulls 6 Tags Updated 5 months ago
DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models such as o3 and Gemini 2.5 Pro.
87M Pulls 35 Tags Updated 11 months ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview on popular math evaluations with just 1.5B parameters.
1.2M Pulls 5 Tags Updated 1 year ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
1.1M Pulls 15 Tags Updated 1 year ago
1 Tag Updated 8 months ago
Fast, lightweight Gemma 4 E2B coding agent for Claude Code, 64K context, native tool-calling, 100% GPU on 16GB Apple Silicon.
130 Pulls 1 Tag Updated 6 days ago
Hy-MT2 is a family of “fast-thinking” multilingual translation models designed for complex real-world scenarios.
218 Pulls 1 Tag Updated 1 week ago
Mistral Medium 3.5 is Mistral AI's first flagship model to merge instruction-following, reasoning, and coding in a single set of 128B weights.
31.3K Pulls 5 Tags Updated 1 month ago
An optimized version of Google's TranslateGemma-12B-it (Gemma 3) designed for high-fidelity translation. This build features hard-coded Temperature=0.1 and English Anchor support to eliminate output redundancy and maximize accuracy.
33.8K Pulls 1 Tag Updated 4 months ago
Fully decensored Qwen2.5-3B-Instruct processed with Heretic abliteration. Achieves 3/100 refusals with 0.11 KL divergence — 97% censorship removal on a consumer RTX 4060. GGUF format, ready to run.
986 Pulls 1 Tag Updated 2 weeks ago
Fully decensored Qwen2.5-Coder-3B-Instruct processed with Heretic abliteration. Achieves 3/100 refusals with an exceptionally low KL divergence of 0.0163 — near-zero model degradation on a consumer RTX 4060.
777 Pulls 1 Tag Updated 2 weeks ago
An attempt to compress Qwen3.5 into 500M and 1.5B parameters.
592 Pulls 2 Tags Updated 2 months ago
372 Pulls 1 Tag Updated 2 months ago
Open-source accessibility coding assistant for the public sector. WCAG 2.2 Level AA conformance, Drupal 11, PHP 8.3, Drush 12, Python 3.12, and Playwright (TypeScript) with both axe-core and Siteimprove Alfa.
12 Pulls 2 Tags Updated 2 weeks ago
Decensored SmolLM3-3B processed with Heretic abliteration. Achieves 63/100 refusals with an exceptionally low KL divergence of 0.0001 — near-perfect model quality preservation on a consumer RTX 4060. GGUF format, ready to run.
195 Pulls 1 Tag Updated 2 months ago
Coder teacher → STEM distillation → logical inference SFT → quantized. Structured reasoning in ~1.2GB.
198 Pulls 1 Tag Updated 2 months ago
Turbo-autistic, 11/10 dialed-in meme-maniac AI overlord with infinite hype-man energy.
259 Pulls 1 Tag Updated 4 months ago
Tencent's state-of-the-art Hunyuan-MT-1.5-1.8B translation model, quantized to Q8_0 GGUF. Supports 33 languages with specialized handling for mixed-language and explanatory translation. Optimized for instruction-following.
150 Pulls 1 Tag Updated 4 months ago
unsloth/GLM-4.6V 106b
138 Pulls 1 Tag Updated 5 months ago