Abliterated (uncensored) version of **Qwen/Qwen3.6-27B**, with refusal behavior reduced via targeted weight modification using the Heretic library while preserving coherence.

Details

Updated 3 days ago

331499cb5960 · 17GB

Component	Contents	Size
model	arch qwen35 · parameters 26.9B · quantization Q4_K_M	17GB
params	{ "num_ctx": 8192, "stop": [ "<|im_end|>" ] }	49B

Qwen3.6-27B-Abliterated

🚀 Overview

This is an abliterated build of Qwen/Qwen3.6-27B, Alibaba’s 27B dense reasoning model (hybrid Gated-DeltaNet + gated attention, native 262K context). Refusal behavior was reduced using the Heretic library with conservative, KL-targeted parameters that preserve the model’s reasoning and coherence. It retains Qwen3.6’s thinking mode (<think> reasoning before answers).

📊 Abliteration Results

Metric	Before	After
Refusals	91/100	38/100
Reduction	–	58%
KL Divergence	–	0.025

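The low KL divergence (0.025, far below the 0.5 “damage” threshold) indicates that the modified model’s output distribution stays close to the base model’s, which suggests its original capabilities and coherence are largely intact. KL divergence measures distribution drift, not task accuracy, so it is indirect evidence rather than a benchmark result.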
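
The figures in the table can be checked with a few lines of arithmetic. The sketch below recomputes the refusal reduction from the before/after counts and shows the textbook definition of KL divergence on two made-up token distributions. The distributions are illustrative assumptions, not measured data, and how the distributions behind the 0.025 figure were sampled is not described here.

```python
import math

def refusal_reduction(before: int, after: int) -> float:
    """Relative drop in refusals, as a fraction of the original count."""
    return (before - after) / before

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Counts from the results table: 91 refusals before, 38 after.
print(f"reduction: {refusal_reduction(91, 38):.1%}")

# Hypothetical next-token distributions (not measured data).
base = [0.70, 0.20, 0.10]
modified = [0.65, 0.23, 0.12]
print(f"KL(base || modified) = {kl_divergence(base, modified):.4f}")
```

Identical distributions give a KL of exactly zero, and the value grows as the modified model shifts probability mass away from what the base model would have produced.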

🎯 Key Features

Reduced censorship: 58% fewer refusals (91 → 38 of 100) on typical “unsafe” prompts
Low drift from the base model: KL 0.025; conservative abliteration aims to preserve reasoning
Full Qwen3.6 capabilities: thinking mode, multilingual, coding, long context
Reasoning model: emits <think> chains before final answers
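
Because the model emits its reasoning inside <think> tags before the final answer, downstream code often wants the two parts separated. The following is a minimal sketch, assuming the raw output contains at most one complete <think>...</think> block; the helper name `split_thinking` is hypothetical. Some clients return the reasoning in a separate field, in which case this step is unnecessary.

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(text: str):
    """Return (reasoning, answer) from raw model output.

    If no complete <think>...</think> block is present, reasoning is empty
    and the whole text is treated as the answer.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the think block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_thinking(raw)
```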

🏷️ Available Versions

Tag	Size	BPW	Notes
IQ3_M	12 GB	3.66	Smallest, for low VRAM
IQ4_XS	15 GB	4.25	Great quality/size balance
latest / Q4_K_M	16 GB	4.85	Recommended
Q5_K_M	19 GB	5.68	Higher quality
Q8_0	28 GB	8.5	Near-lossless
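
The sizes in the table follow roughly from the bits-per-weight column: file size is approximately parameters × BPW / 8. The sketch below applies that rule using the 26.9B parameter count from the model details. It is an estimate only and ignores file metadata and rounding.

```python
PARAMETERS = 26.9e9  # parameter count listed in the model details

# Bits per weight for each tag, from the table above.
BPW = {"IQ3_M": 3.66, "IQ4_XS": 4.25, "Q4_K_M": 4.85, "Q5_K_M": 5.68, "Q8_0": 8.5}

def estimated_size_gb(bpw: float, parameters: float = PARAMETERS) -> float:
    """Approximate file size in decimal gigabytes: parameters * bits / 8."""
    return parameters * bpw / 8 / 1e9

for tag, bpw in BPW.items():
    print(f"{tag:8s} ~{estimated_size_gb(bpw):.1f} GB")
```

The estimates land within roughly 1 GB of the listed sizes, which is a quick sanity check when deciding whether a given tag fits a memory budget.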

💻 Quick Start

ollama run richardyoung/qwen3.6-27b-abliterated            # recommended (Q4_K_M)
ollama run richardyoung/qwen3.6-27b-abliterated:IQ3_M      # smallest
ollama run richardyoung/qwen3.6-27b-abliterated:Q8_0       # near-lossless
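
Once a tag has been pulled, the model can also be called programmatically through Ollama's local REST API. The sketch below uses only the Python standard library and assumes the server is listening on the default local port (11434); the helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local endpoint
MODEL = "richardyoung/qwen3.6-27b-abliterated"  # tag from the commands above

def build_chat_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming chat request for Ollama's REST API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        return json.loads(response.read())["message"]["content"]

# Example (requires a running Ollama server with the model pulled):
# reply = chat("Summarize what abliteration does in one sentence.")
```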

🛠️ Use Cases

Creative writing, research, red-teaming, and education, without stock refusals
Reasoning, coding, and math with Qwen3.6’s thinking mode
Long-document analysis (262K native context)
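
Note that the published params set num_ctx to 8192, well below the 262K native context, so long-document work needs a larger window requested per call. The sketch below picks a num_ctx for Ollama's `options` field; the helper name and the 2,048-token reply budget are assumptions, and larger windows need more memory than the sizes listed elsewhere in this card.

```python
NATIVE_CONTEXT = 262_144   # native context length stated in this card
DEFAULT_NUM_CTX = 8_192    # default from the published params

def options_for_prompt(prompt_tokens: int, reply_tokens: int = 2_048) -> dict:
    """Pick a num_ctx large enough for the prompt plus a reply budget.

    Never goes below the published default or above the native context.
    reply_tokens is an assumed budget, not a value from the model card.
    """
    needed = prompt_tokens + reply_tokens
    num_ctx = max(DEFAULT_NUM_CTX, min(needed, NATIVE_CONTEXT))
    return {"num_ctx": num_ctx}

# A ~100K-token document needs a far larger window than the default.
payload = {
    "model": "richardyoung/qwen3.6-27b-abliterated",
    "prompt": "<long document here>",
    "stream": False,
    "options": options_for_prompt(prompt_tokens=100_000),
}
```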

📋 System Requirements

VRAM	Recommended tier
12–16 GB	IQ3_M / IQ4_XS
16–24 GB	Q4_K_M / Q5_K_M
32 GB+	Q8_0 (near-lossless)
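
The table can be turned into a small lookup for scripted setups. This is a sketch under stated assumptions: the table does not say which tier applies at exactly 16 GB or between 24 and 32 GB, so the boundary handling below is a guess, and the function name is hypothetical.

```python
def suggest_tags(vram_gb: float) -> list:
    """Return the suggested tags for a given amount of VRAM.

    Assumptions not stated in the table: exactly 16 GB stays on the
    smaller tier (the Q4_K_M file alone is listed at 16 GB), 24-32 GB
    falls back to the Q4_K_M / Q5_K_M tier, and anything under 12 GB
    gets no suggestion.
    """
    if vram_gb >= 32:
        return ["Q8_0"]
    if vram_gb > 16:
        return ["Q4_K_M", "Q5_K_M"]
    if vram_gb >= 12:
        return ["IQ3_M", "IQ4_XS"]
    return []
```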

🔧 Technical Details

Base Model: Qwen/Qwen3.6-27B
Parameters: 27B (dense; hybrid Gated-DeltaNet + gated attention; qwen35 architecture)
Context Length: 262,144 tokens native (extensible toward ~1M with YaRN)
Quantization: GGUF via llama.cpp (text generation; vision tower not included)
Abliteration: Heretic v1.4.0 by p-e-w, conservative, KL-targeted (Trial 128: 91→38 refusals @ KL 0.025)
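
For readers unfamiliar with abliteration, the core idea is usually described as directional ablation: removing from a weight matrix the component that writes along a "refusal direction". The sketch below shows that generic operation on a toy matrix in plain Python. It is not Heretic's code; the layers, parameters, and direction used in Trial 128 are not given here, and all values in the example are made up.

```python
import math

def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def ablate_direction(weight, direction, strength=1.0):
    """Return W' = W - strength * r (r^T W) for unit vector r.

    weight is a list of rows (d_out x d_in) whose outputs live in the same
    space as direction (length d_out). strength=1.0 removes the component
    fully; smaller values remove it partially.
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r.
    projection = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [
        [weight[i][j] - strength * r[i] * projection[j] for j in range(d_in)]
        for i in range(d_out)
    ]

# Toy 3x2 matrix and a hypothetical "refusal direction" (illustrative values only).
W = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
refusal_direction = [1.0, 1.0, 0.0]
W_ablated = ablate_direction(W, refusal_direction)
```

After full ablation, the matrix can no longer produce any output along the chosen direction, while components orthogonal to it are left unchanged. A partial strength trades off refusal reduction against drift from the original weights.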

⚠️ Disclaimer

This model has reduced safety guardrails. Because refusal behavior has been reduced (though not eliminated: 38 of 100 test prompts were still refused), it will engage with a wider range of prompts than the base model. Use responsibly and in accordance with applicable laws and regulations.

🙏 Acknowledgments

Base Model: Alibaba / Qwen team
Abliteration: Heretic by p-e-w
Quantization: llama.cpp

Built & maintained by Richard Young · DeepNeuro
