
Abliterated variant of `Qwen/Qwen3.5-4B`, with the refusal direction removed following Arditi et al. (2024). **Not** fine-tuned: no preference or instruction data was added; the only change is that the refusal direction is orthogonalized out.
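
As a rough illustration of what "orthogonalized out" means, here is a minimal sketch of the projection step, not the pipeline used to produce this model. It assumes a refusal direction `r` has already been estimated (in Arditi et al. this is a difference of mean activations between harmful and harmless prompts) and that `weight` is a matrix writing into the residual stream, so its rows index the model dimension. The function name `orthogonalize_out` and the toy matrices are hypothetical; the update applied is `W' = W - r (r^T W)` with `r` normalized to unit length.

```python
import math


def orthogonalize_out(weight, direction):
    """Return a copy of `weight` whose outputs have no component along `direction`.

    weight:    list of rows, shape (d_model, d_in); hypothetical toy stand-in
               for a matrix that writes into the residual stream.
    direction: list of length d_model; need not be unit length.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    r = [x / norm for x in direction]  # unit-length refusal direction

    d_model, d_in = len(weight), len(weight[0])
    if d_model != len(r):
        raise ValueError("direction length must match the number of rows")

    # proj[j] = (r^T W)[j]: how much column j writes along r.
    proj = [sum(r[i] * weight[i][j] for i in range(d_model)) for j in range(d_in)]

    # W' = W - r (r^T W): subtract that component from every column.
    return [[weight[i][j] - r[i] * proj[j] for j in range(d_in)]
            for i in range(d_model)]


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(orthogonalize_out(W, [0.0, 2.0, 0.0]))
    # prints [[1.0, 2.0], [0.0, 0.0], [5.0, 6.0]]
```

After this update the matrix can no longer write anything along `r`, while components orthogonal to `r` are left unchanged, which is why no training data is involved. A real implementation would apply the same projection with tensor operations to each matrix that writes to the residual stream.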