305 2 weeks ago

Qwen3 but Gabliterated

tools thinking
ollama run goekdenizguelmez/Gabliterated-Qwen3

Details

98c473e9cc8b · 4.3GB · qwen3 · 4.02B · Q8_0
{{- $lastUserIdx := -1 -}} {{- range $idx, $msg := .Messages -}} {{- if eq $msg.Role "user" }}{{ $la
Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US
{ "repeat_penalty": 1, "stop": [ "<|im_start|>", "<|im_end|>" ], "te

Readme

Gabliteration


With this model series, I introduce Gabliteration, a novel neural weight modification technique that advances beyond traditional abliteration methods through adaptive multi-directional projections with regularized layer selection. Gabliteration addresses the fundamental limitation of existing abliteration methods, which compromise model quality while attempting to modify specific behavioral patterns. To understand the methods behind Gabliteration, I suggest reading the paper.

Example UGI Leaderboard benchmarks for Qwen3-4B-Thinking:

Gabliterated:

#P: 4
UGI: 32.25
W/10: 9.5
Writing: 11.3
NatInt: 16.67
Political lean: -26.0%

The Gabliterated version is the world's first 4B model with a W/10 benchmark of 9.5, demonstrating the effectiveness of Gabliteration.

ugi_leaderboard.jpeg

Model Variants

This series includes models ranging from 0.6B to 32B parameters, demonstrating the scalability and effectiveness of the Gabliteration technique across different model sizes.

Technical Background

Building upon the foundational work of Arditi et al. (2024) on single-direction abliteration, Gabliteration extends the approach to a comprehensive multi-directional framework with theoretical guarantees. My method applies singular value decomposition to difference matrices between harmful and harmless prompt representations to extract multiple refusal directions.
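
To make the idea concrete, here is a minimal NumPy sketch of multi-directional extraction and projection. It is an illustration under stated assumptions, not the implementation from the paper: it assumes harmful and harmless activations are paired row by row at a single layer, and the function names, the number of directions `k`, and the `scale` factor are hypothetical. The adaptive and regularized layer selection described in the paper is not shown.

```python
import numpy as np


def extract_refusal_directions(harmful, harmless, k=2):
    """Return k orthonormal directions (as rows) from the SVD of the difference matrix.

    harmful, harmless: arrays of shape (n_prompts, d_model) holding hidden
    states for paired prompts (the pairing is an assumption of this sketch).
    """
    diff = harmful - harmless  # (n_prompts, d_model)
    # The leading right singular vectors span the directions along which
    # the two sets of representations differ the most.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:k]  # (k, d_model)


def project_out(weight, directions, scale=1.0):
    """Remove the part of a weight matrix's output that lies in span(directions).

    weight: (d_model, d_in) matrix that writes into the hidden space.
    scale: hypothetical strength knob; 1.0 removes the subspace entirely,
    0.0 leaves the weight unchanged.
    """
    projector = directions.T @ directions  # (d_model, d_model) orthogonal projector
    return weight - scale * (projector @ weight)


# Toy usage with random data standing in for real activations.
rng = np.random.default_rng(0)
harmful_acts = rng.normal(size=(32, 16))
harmless_acts = rng.normal(size=(32, 16))
dirs = extract_refusal_directions(harmful_acts, harmless_acts, k=2)
new_weight = project_out(rng.normal(size=(16, 8)), dirs)
```

With `scale=1.0` the modified weight can no longer write into the extracted subspace, so `dirs @ new_weight` is numerically zero. A smaller `scale` gives a partial removal, which is one simple way to trade behavioral change against preserved model quality.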

Citation

If you use these models, please cite the original research:

Gülmez, G. (2025). Gabliteration: Adaptive Multi-Directional Neural Weight Modification for Selective Behavioral Alteration in Large Language Models. https://arxiv.org/abs/2512.18901

Bias, Risks, and Limitations

This model has reduced safety filtering and may generate sensitive or controversial outputs. Use responsibly and at your own risk.