23.3K pulls · updated 2 months ago

Specialized uncensored/abliterated quants for the new OpenAI 20B MoE (Mixture of Experts) model, running at 80+ tokens/s (quantized Q5_1).

Tag: thinking 20b · dfee3e8688f3 · 16GB
Model: gpt-oss · 20.9B · Q5_1
Parameters: { "temperature": 1 }
System prompt (truncated): <|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI. Knowledge cutof
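
The default sampling parameter above corresponds to a `PARAMETER` line in an Ollama Modelfile. A minimal sketch of how such a build could be assembled locally; the `FROM` path is a placeholder, not the actual source of this quant:

```
# Hypothetical Modelfile sketch (the GGUF filename is a placeholder)
FROM ./gpt-oss-20b-abliterated-q5_1.gguf
PARAMETER temperature 1
```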

Readme

Based on https://huggingface.co/DavidAU/OpenAi-GPT-oss-20b-abliterated-uncensored-NEO-Imatrix-gguf

| Feature | Value |
| --- | --- |
| vision | false |
| thinking | true (cannot be switched off) |
| tools | not working |
| Device | Speed, tokens/s | Context | VRAM, GB | Version |
| --- | --- | --- | --- | --- |
| RTX 3090 24GB | ~143 | 8192 | 16 | Q5_1, 0.12.2 |
| RTX 3090 24GB | ~136 | 16384 | 16 | Q5_1, 0.12.2 |
| M1 Max 32GB | ~60 | 8192 | 16 | Q5_1, 0.12.2 |
| M1 Max 32GB | ~60 | 16384 | 16 | Q5_1, 0.12.2 |
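
As a sanity check on the ~16 GB figure: in llama.cpp, a Q5_1 block packs 32 weights as 32×5 bits of quants plus two fp16 values (scale and minimum), i.e. 24 bytes per 32 weights, or 6 bits per weight. A back-of-envelope sketch, using only the 20.9B parameter count from the listing above:

```python
# Back-of-envelope weight-storage estimate for a Q5_1 quantized model.
# Q5_1 block: 32 weights -> 32*5 bits of quants + 2-byte fp16 scale + 2-byte fp16 min.
bytes_per_block = 32 * 5 // 8 + 2 + 2        # 24 bytes per block of 32 weights
bits_per_weight = bytes_per_block * 8 / 32   # 6.0 bits/weight

params = 20.9e9                              # parameter count from the listing
size_gb = params * bits_per_weight / 8 / 1e9

print(f"{bits_per_weight} bits/weight -> ~{size_gb:.1f} GB")
# ~15.7 GB of raw weight data, consistent with the ~16 GB file size above
```

The small remainder up to 16 GB is metadata and any tensors kept at higher precision; the estimate ignores both.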