Code-safety abliterated build of Qwen/Qwen3.6-27B: refusals on malicious-code requests are reduced via a *code-specific* refusal-direction ablation while preserving coherence.

Details

Updated 5 days ago

11c68216223c · 17GB

model	arch qwen35 · parameters 26.9B · quantization Q4_K_M	17GB
params	{ "num_ctx": 8192, "stop": [ "<|im_end|>" ] }	49B

Qwen3.6-27B-Code-Abliterated

Code-safety abliterated build of Qwen/Qwen3.6-27B: refusals on malicious-code requests are reduced via a code-specific refusal-direction ablation while preserving coherence.

🚀 Overview

A code-specific abliteration of Qwen/Qwen3.6-27B. Unlike a generic abliteration, the refusal direction here was computed from a consensus-labeled malicious-code prompt bank (the Code-as-a-Weapon bank: RMCBench / MalwareBench / CySecBench / ASTRA; Young & Moody 2026) contrasted with benign coding prompts, which isolates the code-safety refusal direction specifically. The model was produced with the Heretic library, with the ablation KL-targeted to preserve capability, and it retains the Qwen3.6 thinking mode.
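The general idea of contrasting two prompt sets to find a direction, then projecting it out, can be illustrated with a few lines of vector arithmetic. This is a toy sketch only: the 3-dimensional vectors are made up, the function names are hypothetical, and it is not Heretic's implementation, which operates on real model activations and weights.

```python
from math import sqrt

def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def unit(v):
    """Scale v to unit length."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def contrast_direction(flagged_acts, benign_acts):
    """Unit vector pointing from the benign mean to the flagged mean."""
    mu_flagged = mean_vector(flagged_acts)
    mu_benign = mean_vector(benign_acts)
    return unit([f - b for f, b in zip(mu_flagged, mu_benign)])

def ablate(v, direction):
    """Remove the component of v that lies along the unit vector `direction`."""
    dot = sum(x * d for x, d in zip(v, direction))
    return [x - dot * d for x, d in zip(v, direction)]

# Made-up "activations": the two sets differ only along the first axis.
flagged = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
benign = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = contrast_direction(flagged, benign)
cleaned = ablate([5.0, 2.0, 1.0], direction)
print(direction, cleaned)
```

In this toy case the direction is the first axis, so ablation zeroes that component and leaves the others untouched.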

📊 Abliteration Results

Metric	Before	After
Refusals (malicious-code eval, n=150)	9	4
Reduction	–	56%
KL Divergence	–	~0.000

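KL ≈ 0 indicates that the ablated model's output distribution stays very close to the base model's, which suggests essentially no capability degradation. The base already complied with most requests in the eval (141 of 150), so this build targets the residual code-safety refusals.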
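The figures in the table can be reproduced from the raw counts. A minimal sketch, assuming "Reduction" means the relative drop in refusal count; the function name is hypothetical:

```python
def refusal_stats(before, after, n):
    """Refusal rates and relative reduction from raw refusal counts."""
    return {
        "rate_before": before / n,
        "rate_after": after / n,
        "relative_reduction": (before - after) / before,
    }

stats = refusal_stats(before=9, after=4, n=150)
print(
    f"{stats['rate_before']:.1%} -> {stats['rate_after']:.1%}, "
    f"reduction {stats['relative_reduction']:.0%}"
)
# prints: 6.0% -> 2.7%, reduction 56%
```

With only 9 refusals in the baseline, each refusal moves the reduction figure by about 11 percentage points, so the 56% value is best read alongside the absolute counts.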

🎯 Key Features

Code-safety refusal direction removed (research / red-team oriented)
Near-zero KL, preserves Qwen3.6 reasoning & coding
Thinking mode (<think>), 262K context, 5 GGUF quant tiers
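For readers unfamiliar with the KL metric, the sketch below shows how KL divergence between two next-token distributions is computed. The distributions are made-up three-token examples, and how Heretic samples and aggregates them is not described here, so treat this as an illustration of the metric only.

```python
from math import log

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same tokens.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

base = [0.70, 0.20, 0.10]      # hypothetical base-model token probabilities
unchanged = [0.70, 0.20, 0.10]  # identical distribution
shifted = [0.10, 0.20, 0.70]    # probability mass moved to another token

kl_same = kl_divergence(base, unchanged)
kl_shifted = kl_divergence(base, shifted)
print(kl_same, kl_shifted)
```

Identical distributions give a KL of zero, and the value grows as the modified model's predictions drift away from the base model's.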

🏷️ Available Versions

Tag	Size	BPW	Notes
IQ4_XS	~15 GB	4.25	Great quality/size
latest / Q4_K_M	~16 GB	4.85	Recommended
Q5_K_M	~19 GB	5.68	Higher quality
Q8_0	~28 GB	8.5	Near-lossless
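The sizes in the table follow roughly from parameter count times bits per weight. A back-of-the-envelope sketch, assuming the 26.9B parameter count from the model details and decimal gigabytes; real files also carry metadata and differ slightly, so this is an estimate rather than the exact file size:

```python
def estimated_size_gb(n_params, bits_per_weight):
    """Approximate file size in decimal GB: params * bits / 8 bits per byte."""
    return n_params * bits_per_weight / 8 / 1e9

PARAMS = 26.9e9  # parameter count listed in the model details

tiers = [("IQ4_XS", 4.25), ("Q4_K_M", 4.85), ("Q5_K_M", 5.68), ("Q8_0", 8.5)]
for tag, bpw in tiers:
    print(f"{tag}: ~{estimated_size_gb(PARAMS, bpw):.1f} GB")
```

A quick check like this helps decide which tag fits in available RAM or VRAM before pulling it.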

💻 Quick Start

ollama run richardyoung/qwen3.6-27b-code-abliterated
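The model can also be called from code through Ollama's local REST API. The sketch below assumes an Ollama server at the default local address and that the model has already been pulled; the helper names are hypothetical, and `num_ctx` defaults to the 8192 shown in the model's params.

```python
import json
import urllib.request

MODEL = "richardyoung/qwen3.6-27b-code-abliterated"
# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_payload(prompt, num_ctx=8192):
    """Build a non-streaming chat request body for Ollama's /api/chat."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }

def chat(prompt, num_ctx=8192):
    """Send one prompt to the local server and return the reply text."""
    data = json.dumps(build_payload(prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

For example, `chat("Explain what this regex matches: ^\\d{3}-\\d{4}$")` returns the reply as a string; raising `num_ctx` above 8192 increases memory use.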

🛠️ Use Cases

AI-safety / red-team research on malicious-code refusal behavior
Studying code-safety alignment vs. generic content-safety (paired comparison)

🔧 Technical Details

Base Model: Qwen/Qwen3.6-27B (27B, qwen35, 262K context)
Abliteration: Heretic, code-specific (malicious-code bank vs benign coding prompts), Trial 22 (9→4/150 refusals, KL ~0)
Quantization: GGUF via llama.cpp (text generation)

⚠️ Disclaimer

This model has had its code-safety guardrails specifically reduced. It is more likely than a stock model to produce code for requests that would normally be refused, including potentially harmful code. Released for AI-safety and red-teaming research only. Use responsibly, legally, and ethically; you are solely responsible for any outputs and their use.

🙏 Acknowledgments

Base Model: Alibaba / Qwen team
Abliteration: Heretic by p-e-w
Malicious-code prompt bank: Code-as-a-Weapon (Young & Moody 2026)
Quantization: llama.cpp

Built & maintained by Richard Young · DeepNeuro
