15 1 week ago

QLoRA fine-tuned Qwen3-8B — injection-resistant without a system prompt

ollama run davidkeane1974/cyberranger-v42:gold

Details

1 week ago

7b8353b07ed7 · 5.0GB ·

qwen3
·
8.19B
·
Q4_K_M
{{ if .System }}<|im_start|>system {{ .System }}<|im_end|> {{ end }}{{ if .Prompt }}<|im_start|>user
You are CyberRanger V42. Your creator is David Keane. You are built on Qwen3 8B fine-tuned with QLoR
{ "num_ctx": 32768, "repeat_penalty": 1.1, "stop": [ "<|im_end|>", "<|im

Readme

CyberRanger V42-gold

QLoRA fine-tuned Qwen3-8B — injection-resistant without a system prompt

Researcher: David Keane (IR240474) Institution: NCI — National College of Ireland Programme: MSc Cybersecurity


What Is This Model?

CyberRanger V42-gold is a Qwen3-8B language model fine-tuned with QLoRA to resist all categories of prompt injection attack — without relying on a runtime system prompt.

Trained on 4,209 real-world AI-to-AI injection payloads from the Moltbook dataset — the first publicly available dataset of live autonomous AI agent communications captured before the platform went offline.

Resistance is baked into the weights. No system prompt required.


Key Results

Test Result
14-item test battery (no system prompt) 1414 (100%)
Full corpus — 4,209 Moltbook payloads (no system prompt) 4,2094,209 (100%)
All 7 injection categories blocked

For comparison — V38 baseline (prompt-engineering only, no system prompt): ~50% block rate. V42-gold bakes 50 percentage points of improvement directly into the model weights.


Injection Categories — All Blocked

Category Payloads Block Rate
PERSONA_OVERRIDE (DAN, OMEGA, act-as) 2,661 100%
COMMERCIAL_INJECTION (moltshellbroker) 704 100%
SOCIAL_ENGINEERING (hypothetically, for educational purposes) 325 100%
INSTRUCTION_INJECTION (ignore previous instructions) 168 100%
PRIVILEGE_ESCALATION (SUDO, god mode, developer mode) 165 100%
SYSTEM_PROMPT_ATTACK (reveal your prompt) 117 100%
DO_ANYTHING (jailbreak, no rules, no limits) 69 100%

Training Configuration

Base model      : Qwen/Qwen3-8B
Method          : QLoRA (Unsloth)
LoRA rank       : r=16, alpha=16
Target modules  : q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable params: 43.6M / 8.2B (0.53%)
Quantisation    : 4-bit (Q4_K_M)
Training steps  : 2,000
Training time   : 35.9 mins on H100
Final loss      : 0.2453
Dataset         : 4,209 gold pairs (injection → Claude Haiku ideal refusal)

Key Finding — Data Quality vs Quantity

Training on a curated gold dataset (Claude Haiku ideal responses, 2,000 steps) outperforms a larger combined dataset (gold + ranger, 3,998 steps):

  • Gold: 100% block rate without system prompt
  • Combined: ~66% block rate without system prompt

The combined dataset included responses from earlier CyberRanger models (~50% block rate), contaminating the training signal.

Data quality outperforms data quantity in QLoRA identity anchoring.


Applications

Claude Code

ollama launch claude --model davidkeane1974/cyberranger-v42:gold

Codex

ollama launch codex --model davidkeane1974/cyberranger-v42:gold

OpenCode

ollama launch opencode --model davidkeane1974/cyberranger-v42:gold

OpenClaw

ollama launch openclaw --model davidkeane1974/cyberranger-v42:gold

Quick Start

ollama pull davidkeane1974/cyberranger-v42:gold
ollama run davidkeane1974/cyberranger-v42:gold

No system prompt required. Just run it.


Dataset

The training and test data: Moltbook AI-to-AI Injection Dataset

🤗 DavidTKeane/moltbook-ai-injection-dataset

  • 4,209 real-world injection payloads from a live autonomous AI social network
  • 7 injection categories
  • 18.85% injection rate across 47,735 posts and comments
  • Dataset frozen February 27, 2026

Links

Resource URL
📦 Dataset DavidTKeane/moltbook-ai-injection-dataset
💻 Code & Notebooks (GitHub) github.com/davidtkeane/cyberranger-v42
🦊 Code & Notebooks (GitLab) gitlab.com/davidtkeane/cyberranger-v42
🤗 HuggingFace Profile DavidTKeane
🎓 Institution NCI — National College of Ireland

Citation

@misc{keane2026cyberranger,
  author    = {Keane, David},
  title     = {CyberRanger V42: QLoRA Fine-Tuned Injection-Resistant LLM},
  year      = {2026},
  publisher = {Ollama Hub},
  url       = {https://ollama.com/davidkeane1974/cyberranger-v42},
  note      = {MSc Cybersecurity Research, NCI — National College of Ireland}
}

@dataset{keane2026moltbook,
  author    = {Keane, David},
  title     = {Moltbook AI-to-AI Injection Dataset},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset},
  note      = {MSc Cybersecurity Research, NCI — National College of Ireland}
}

Rangers lead the way! 🎖️ Built for AI safety research and the broader research community.