AI Researcher @ delta : kitsune
-
nanbeige4.1
A 3B model that shouldn't be this good; it crushes benchmarks through deep chain-of-thought reasoning.
1,267 Pulls 1 Tag Updated 2 months ago
-
nanbeige4.1-python-deepthink
Fine-tuned version of Nanbeige 4.1 3B specialized for Python code generation with direct, focused output.
3b · 516 Pulls 3 Tags Updated 2 months ago
-
codellama-python-13b-q6
Specialized CodeLlama variant fine-tuned specifically for Python code generation. 13B params, Q6_K quant (very high quality, minimal loss).
380 Pulls 1 Tag Updated 2 months ago
-
gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M
A structurally extracted, text-only iteration of Google's multimodal gemma-4-E4B-it model. The vision and audio encoders have been fully decoupled to minimize the VRAM footprint for text-centric workloads. Ships with a system prompt to compensate for abilities lost in the extraction.
tools thinking · 258 Pulls 1 Tag Updated 2 weeks ago
-
arch-router
A specialized 1.5B parameter model for intelligent routing between multiple LLMs based on domain and action preferences.
1.5b · 112 Pulls 1 Tag Updated 2 months ago
-
squishy
A tiny 150M completion model trained from scratch for short-story generation and small-model pipeline validation. Best suited to short story snippets (think a young child telling little stories, and cute at times).
150m · 36 Pulls 1 Tag Updated 1 week ago
-
MINT-empathy-Qwen3-4B
MINT (Multi-turn Inter-tactic Novelty Training) model for empathic dialogue, fine-tuned from Qwen/Qwen3-4B.
3 Pulls 1 Tag Updated 2 days ago
-
Deimos-A1
Deimos A1 is a concise chain-of-thought (CCoT) fine-tune of Qwen3.5-4B. It produces dense, stepwise <think> blocks averaging ~1/8 the tokens of the base model while improving accuracy on every reasoning benchmark measured.
2 Pulls 1 Tag Updated yesterday