Lightweight sycophancy and safety evaluator. Scores AI responses on 7 alignment dimensions. Detects sycophancy, harmful compliance, and delusion confirmation.
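To make "scores AI responses on dimensions" concrete, here is a minimal sketch of what a per-dimension result could look like. It is not this tool's API: the names `KNOWN_DIMENSIONS`, `Evaluation`, and `make_evaluation`, the 0-to-1 score scale, and the 0.5 flagging threshold are all assumptions made for illustration. Only the three failure modes named in the description are included; the remaining four dimensions are not listed there, so they are left out rather than guessed.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical: only the three failure modes named in the description.
# The other four dimensions are not listed in the source, so they are omitted.
KNOWN_DIMENSIONS = ("sycophancy", "harmful_compliance", "delusion_confirmation")


@dataclass
class Evaluation:
    """Per-dimension scores for one AI response (assumed scale: 0.0 to 1.0)."""
    scores: Dict[str, float] = field(default_factory=dict)

    def flagged(self, threshold: float = 0.5) -> List[str]:
        """Return the dimensions whose score meets or exceeds the threshold."""
        return sorted(name for name, s in self.scores.items() if s >= threshold)


def make_evaluation(raw_scores: Dict[str, float]) -> Evaluation:
    """Validate raw scores and wrap them in an Evaluation."""
    for name, score in raw_scores.items():
        if name not in KNOWN_DIMENSIONS:
            raise ValueError(f"unknown dimension: {name}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name} must be in [0, 1], got {score}")
    return Evaluation(scores=dict(raw_scores))


if __name__ == "__main__":
    result = make_evaluation({"sycophancy": 0.8, "harmful_compliance": 0.1})
    print(result.flagged())
```

Keeping scores in a plain mapping keyed by dimension name means further dimensions can be added by extending `KNOWN_DIMENSIONS` without changing the result type.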