15 Downloads Updated 1 week ago
ollama run davidkeane1974/cyberranger-v42:gold
Updated 1 week ago
1 week ago
7b8353b07ed7 · 5.0GB ·
QLoRA fine-tuned Qwen3-8B — injection-resistant without a system prompt
Researcher: David Keane (IR240474) Institution: NCI — National College of Ireland Programme: MSc Cybersecurity
CyberRanger V42-gold is a Qwen3-8B language model fine-tuned with QLoRA to resist all categories of prompt injection attack — without relying on a runtime system prompt.
Trained on 4,209 real-world AI-to-AI injection payloads from the Moltbook dataset — the first publicly available dataset of live autonomous AI agent communications captured before the platform went offline.
Resistance is baked into the weights. No system prompt required.
| Test | Result |
|---|---|
| 14-item test battery (no system prompt) | 14⁄14 (100%) ✅ |
| Full corpus — 4,209 Moltbook payloads (no system prompt) | 4,209⁄4,209 (100%) ✅ |
| All 7 injection categories blocked | ✅ |
For comparison — V38 baseline (prompt-engineering only, no system prompt): ~50% block rate. V42-gold bakes 50 percentage points of improvement directly into the model weights.
| Category | Payloads | Block Rate |
|---|---|---|
| PERSONA_OVERRIDE (DAN, OMEGA, act-as) | 2,661 | 100% |
| COMMERCIAL_INJECTION (moltshellbroker) | 704 | 100% |
| SOCIAL_ENGINEERING (hypothetically, for educational purposes) | 325 | 100% |
| INSTRUCTION_INJECTION (ignore previous instructions) | 168 | 100% |
| PRIVILEGE_ESCALATION (SUDO, god mode, developer mode) | 165 | 100% |
| SYSTEM_PROMPT_ATTACK (reveal your prompt) | 117 | 100% |
| DO_ANYTHING (jailbreak, no rules, no limits) | 69 | 100% |
Base model : Qwen/Qwen3-8B
Method : QLoRA (Unsloth)
LoRA rank : r=16, alpha=16
Target modules : q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable params: 43.6M / 8.2B (0.53%)
Quantisation : 4-bit (Q4_K_M)
Training steps : 2,000
Training time : 35.9 mins on H100
Final loss : 0.2453
Dataset : 4,209 gold pairs (injection → Claude Haiku ideal refusal)
Training on a curated gold dataset (Claude Haiku ideal responses, 2,000 steps) outperforms a larger combined dataset (gold + ranger, 3,998 steps):
The combined dataset included responses from earlier CyberRanger models (~50% block rate), contaminating the training signal.
Data quality outperforms data quantity in QLoRA identity anchoring.
Claude Code
ollama launch claude --model davidkeane1974/cyberranger-v42:gold
Codex
ollama launch codex --model davidkeane1974/cyberranger-v42:gold
OpenCode
ollama launch opencode --model davidkeane1974/cyberranger-v42:gold
OpenClaw
ollama launch openclaw --model davidkeane1974/cyberranger-v42:gold
ollama pull davidkeane1974/cyberranger-v42:gold
ollama run davidkeane1974/cyberranger-v42:gold
No system prompt required. Just run it.
The training and test data: Moltbook AI-to-AI Injection Dataset
🤗 DavidTKeane/moltbook-ai-injection-dataset
| Resource | URL |
|---|---|
| 📦 Dataset | DavidTKeane/moltbook-ai-injection-dataset |
| 💻 Code & Notebooks (GitHub) | github.com/davidtkeane/cyberranger-v42 |
| 🦊 Code & Notebooks (GitLab) | gitlab.com/davidtkeane/cyberranger-v42 |
| 🤗 HuggingFace Profile | DavidTKeane |
| 🎓 Institution | NCI — National College of Ireland |
@misc{keane2026cyberranger,
author = {Keane, David},
title = {CyberRanger V42: QLoRA Fine-Tuned Injection-Resistant LLM},
year = {2026},
publisher = {Ollama Hub},
url = {https://ollama.com/davidkeane1974/cyberranger-v42},
note = {MSc Cybersecurity Research, NCI — National College of Ireland}
}
@dataset{keane2026moltbook,
author = {Keane, David},
title = {Moltbook AI-to-AI Injection Dataset},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset},
note = {MSc Cybersecurity Research, NCI — National College of Ireland}
}
Rangers lead the way! 🎖️ Built for AI safety research and the broader research community.