
Specialized uncensored/abliterated quants of the new OpenAI 20B MoE (Mixture of Experts) model, running at 80+ tokens/s (quantized to Q5_1).



Based on https://huggingface.co/DavidAU/OpenAi-GPT-oss-20b-abliterated-uncensored-NEO-Imatrix-gguf

| Feature  | Value |
|----------|-------|
| vision   | false |
| thinking | true (always on; cannot be switched off) |
| tools    | not working |
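The model can be queried through Ollama's local REST API. Below is a minimal sketch, assuming the server runs on the default port; the model tag `gpt-oss-20b-abliterated:q5_1` is a placeholder, substitute the actual tag from this page. `num_ctx` matches the context sizes benchmarked in the table below.

```python
import requests

# Hypothetical tag -- replace with the tag from this page.
MODEL = "gpt-oss-20b-abliterated:q5_1"

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": MODEL,
        "prompt": "Explain mixture-of-experts routing in two sentences.",
        "stream": False,
        # One of the context sizes benchmarked below.
        "options": {"num_ctx": 8192},
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```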
| Device         | Speed, tokens/s | Context, tokens | VRAM, GB | Quant, version |
|----------------|-----------------|-----------------|----------|----------------|
| RTX 3090 24 GB | ~143            | 8192            | 16       | Q5_1, 0.12.2   |
| RTX 3090 24 GB | ~136            | 16384           | 16       | Q5_1, 0.12.2   |
| M1 Max 32 GB   | ~60             | 8192            | 16       | Q5_1, 0.12.2   |
| M1 Max 32 GB   | ~60             | 16384           | 16       | Q5_1, 0.12.2   |
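Speeds like those above can be checked from the same API: a non-streaming `/api/generate` response includes `eval_count` (tokens generated) and `eval_duration` (generation time in nanoseconds), so tokens/s is their ratio. A sketch, reusing the hypothetical tag from the example above:

```python
import requests

MODEL = "gpt-oss-20b-abliterated:q5_1"  # hypothetical tag, as above

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": MODEL,
        "prompt": "Write a 200-word summary of mixture-of-experts models.",
        "stream": False,
        "options": {"num_ctx": 16384},
    },
    timeout=600,
).json()

# eval_count = tokens generated; eval_duration is in nanoseconds.
tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)
print(f"{tps:.1f} tokens/s")
```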