AI Researcher @ delta : kitsune
-
nanbeige4.1
A 3B model that shouldn't be this good; it crushes benchmarks through deep chain-of-thought reasoning.
1,267 Pulls 1 Tag Updated 2 months ago
-
nanbeige4.1-python-deepthink
Fine-tuned version of Nanbeige 4.1 3B specialized for Python code generation with direct, focused output.
3b · 516 Pulls 3 Tags Updated 2 months ago
-
codellama-python-13b-q6
Specialized CodeLlama variant fine-tuned specifically for Python code generation. 13B params, Q6_K quant (very high quality, minimal loss).
380 Pulls 1 Tag Updated 2 months ago
-
gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M
A structurally extracted, text-only iteration of Google's multimodal gemma-4-E4B-it model. The vision and audio encoders have been fully decoupled to minimize the VRAM footprint for text-centric workloads. Ships with a system prompt to compensate for abilities lost in the extraction.
tools thinking · 258 Pulls 1 Tag Updated 2 weeks ago
-
arch-router
A specialized 1.5B parameter model for intelligent routing between multiple LLMs based on domain and action preferences.
1.5b · 112 Pulls 1 Tag Updated 2 months ago
-
squishy
A tiny 150M completion model trained from scratch for short-story generation and small-model pipeline validation. Best suited to short story snippets (think a young child telling little stories, and cute at times).
150m · 36 Pulls 1 Tag Updated 1 week ago
-
MINT-empathy-Qwen3-4B
MINT (Multi-turn Inter-tactic Novelty Training) model for empathic dialogue, fine-tuned from Qwen/Qwen3-4B.
3 Pulls 1 Tag Updated 2 days ago
-
Deimos-A1
Deimos A1 is a concise chain-of-thought (CCoT) fine-tune of Qwen3.5-4B. It produces dense, stepwise <think> blocks averaging ~1/8 the tokens of the base model while improving accuracy on every reasoning benchmark measured.
2 Pulls 1 Tag Updated yesterday