A fine-tuned version of Qwen2.5-1.5B-Instruct specialized in cause analysis and risk assessment for Slips IDS alert logs.
Trained at the Stratosphere Research Laboratory, Czech Technical University in Prague.
Slips is a machine-learning-based network intrusion detection system (IDS). It generates DAG-structured evidence logs that group related security events for an IP address and time window.
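This model performs two complementary analyst tasks over the same Slips DAG: cause analysis, which lists possible malicious, legitimate, and misconfiguration explanations, and risk assessment, which assigns a risk level, business impact, likelihood of malicious activity, and investigation priority.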
Optimized for local and edge deployment through Ollama. The model is small enough to run on constrained systems while producing structured analyst-facing outputs.
Fine-tuning method: supervised fine-tuning (SFT) using Unsloth with LoRA on combined cause+risk records.
Training data: Best-of-N responses, scored by an LLM judge and selected from GPT-4o, GPT-4o-mini, Qwen2.5 3B, and Qwen2.5 1.5B baseline outputs.
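The Best-of-N selection step can be pictured with the sketch below. It is only an illustration, not the lab's pipeline: the `Candidate` record, its field names, the numeric `judge_score`, and the grouping per incident are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One model response to an incident with its judge score (hypothetical schema)."""
    incident_id: str
    model: str
    response: str
    judge_score: float


def select_best_of_n(candidates):
    """Keep the highest-scored response per incident; ties keep the first one seen."""
    best = {}
    for cand in candidates:
        current = best.get(cand.incident_id)
        if current is None or cand.judge_score > current.judge_score:
            best[cand.incident_id] = cand
    return best


# Illustrative scores only; these are not values from the training set.
pool = [
    Candidate("inc-1", "gpt-4o", "response A", 8.0),
    Candidate("inc-1", "qwen2.5-3b", "response B", 6.5),
    Candidate("inc-2", "gpt-4o-mini", "response C", 7.0),
    Candidate("inc-2", "qwen2.5-1.5b", "response D", 7.5),
]
training_records = select_best_of_n(pool)
```

Each selected response would then become the target text of one SFT record.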
ollama run stratosphere/qwen2.5-1.5b-slips-immune-risk
The default latest tag points to q4_k_m.
| Tag | Use case |
|---|---|
| latest / q4_k_m | Recommended default; smallest and fastest |
| q5_k_m | Better quality/size tradeoff |
| q8_0 | Highest quality quantized version |
# Default Q4_K_M
ollama pull stratosphere/qwen2.5-1.5b-slips-immune-risk
# Balanced quality/size
ollama pull stratosphere/qwen2.5-1.5b-slips-immune-risk:q5_k_m
# Highest quality quantized model
ollama pull stratosphere/qwen2.5-1.5b-slips-immune-risk:q8_0
This model was trained with two separate user prompt formats. Run both prompts on the same incident DAG if you want a complete cause+risk report.
The Ollama model uses Qwen chat formatting internally, so send the prompt directly as the user message.
Both prompts expect this metadata plus the Slips DAG evidence:
INCIDENT METADATA:
- Incident ID: <incident_id>
- Source IP: <source_ip>
- Timewindow: <timewindow>
- Accumulated Threat Level: <threat_level>
- Time Range: <start> to <end>
- Total Events: <event_count>
SECURITY EVIDENCE:
<compacted Slips DAG analysis>
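The template above can be filled programmatically. The following is a minimal sketch, assuming the incident fields are available in a plain dictionary. The helper name `format_incident_block`, the dictionary keys, and the sample values are hypothetical, and the blank line before `SECURITY EVIDENCE:` is a guess, since this page does not show the exact whitespace used in training.

```python
def format_incident_block(meta, dag_analysis):
    """Render the metadata and evidence sections in the layout shown above."""
    lines = [
        "INCIDENT METADATA:",
        f"- Incident ID: {meta['incident_id']}",
        f"- Source IP: {meta['source_ip']}",
        f"- Timewindow: {meta['timewindow']}",
        f"- Accumulated Threat Level: {meta['threat_level']}",
        f"- Time Range: {meta['start']} to {meta['end']}",
        f"- Total Events: {meta['event_count']}",
        "",  # assumed separator line; adjust to match your training data
        "SECURITY EVIDENCE:",
        dag_analysis.strip(),
    ]
    return "\n".join(lines)


# Illustrative values only; real values come from the Slips output.
example_meta = {
    "incident_id": "example-incident-001",
    "source_ip": "192.0.2.10",
    "timewindow": "1",
    "threat_level": "12.5",
    "start": "2024-01-01 10:00:00",
    "end": "2024-01-01 11:00:00",
    "event_count": 42,
}
block = format_incident_block(example_meta, "<compacted Slips DAG analysis>\n")
print(block)
```

The resulting block is then embedded in either the cause analysis prompt or the risk assessment prompt.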
You are a cybersecurity analyst. Analyze the following network security incident and provide a structured analysis of possible causes.
INCIDENT METADATA:
- Incident ID: <incident_id>
- Source IP: <source_ip>
- Timewindow: <timewindow>
- Accumulated Threat Level: <threat_level>
- Time Range: <timeline>
- Total Events: <event_count>
SECURITY EVIDENCE:
<dag_analysis>
Output Requirements:
- Respond with ONLY the analysis content
- Do NOT include any prefixes (like "AI:"), statistics, or metadata
- Do NOT include token counts, timing information, or performance stats
- Use this exact structure:
**Possible Causes:**
**1. Malicious Activity:**
• [Specific attack technique or malicious cause]
• [Additional malicious possibilities if relevant]
**2. Legitimate Activity:**
• [Benign operational cause]
• [Additional legitimate possibilities if relevant]
**3. Misconfigurations:**
• [Technical misconfigurations that could cause this behavior]
**Conclusion:** [1-2 sentence assessment of most likely cause category with recommendation for further investigation]
Guidelines:
- Be succinct (fewer words than raw evidence)
- Focus on relevant causes only (attack techniques, misconfigurations, legitimate operations)
- Use precise analyst-level language
- Maintain consistent structure and depth across all analyses
- Avoid generic definitions or unnecessary context
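Because the cause analysis prompt above requests a fixed structure, the response can be split into sections for downstream tooling. The parser below is a sketch under the assumption that the model follows the requested layout exactly. The function name and the sample response are illustrative, not real model output.

```python
import re

# Matches numbered section headers such as "**1. Malicious Activity:**".
SECTION_RE = re.compile(r"^\*\*\d+\.\s*(.+?):\*\*\s*$")
CONCLUSION_RE = re.compile(r"^\*\*Conclusion:\*\*\s*(.*)$")


def parse_cause_analysis(text):
    """Return ({section name: [bullets]}, conclusion) from a cause-analysis response."""
    sections, conclusion, current = {}, "", None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = SECTION_RE.match(line)
        concl = CONCLUSION_RE.match(line)
        if header:
            current = header.group(1)
            sections[current] = []
        elif concl:
            conclusion, current = concl.group(1), None
        elif current is not None and line[0] in "•-":
            sections[current].append(line.lstrip("•- ").strip())
    return sections, conclusion


# Hand-written sample in the requested format (not actual model output).
sample = """**Possible Causes:**
**1. Malicious Activity:**
• Horizontal port scan from a compromised host
**2. Legitimate Activity:**
• Scheduled vulnerability scanner
**3. Misconfigurations:**
• Misrouted monitoring probe
**Conclusion:** Most likely malicious; review outbound connections."""

sections, conclusion = parse_cause_analysis(sample)
```

A response that deviates from the format yields missing sections, so callers may want to check that all three section names are present.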
You are a cybersecurity analyst. Analyze the following network security incident and provide a structured risk assessment.
INCIDENT METADATA:
- Incident ID: <incident_id>
- Source IP: <source_ip>
- Timewindow: <timewindow>
- Accumulated Threat Level: <threat_level>
- Time Range: <timeline>
- Total Events: <event_count>
SECURITY EVIDENCE:
<dag_analysis>
Output Requirements:
- Respond with ONLY the assessment content
- Do NOT include any prefixes (like "AI:"), statistics, or metadata
- Do NOT include token counts, timing information, or performance stats
- Use this exact structure:
**Risk Level:** [Critical/High/Medium/Low]
**Justification:** [1-2 sentence technical justification for the risk level]
**Business Impact:** [Single clear sentence describing the most relevant business effect]
**Likelihood of Malicious Activity:** [High/Medium/Low] - [Brief rationale]
**Investigation Priority:** [Immediate/High/Medium/Low] - [Brief justification]
Guidelines:
- Use only the four risk levels: Critical, High, Medium, Low
- Keep justifications concise and technical
- Focus business impact on most relevant effect (data access, service disruption, etc.)
- Use consistent language for likelihood assessments
- Maintain uniform structure and depth across all assessments
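The five labelled fields requested by the risk assessment prompt above lend themselves to simple validation. This is a minimal sketch, assuming each field appears on its own line exactly as specified. `parse_risk_assessment` is a hypothetical helper and the sample text is hand-written, not model output.

```python
import re

FIELDS = (
    "Risk Level",
    "Justification",
    "Business Impact",
    "Likelihood of Malicious Activity",
    "Investigation Priority",
)
RISK_LEVELS = {"Critical", "High", "Medium", "Low"}


def parse_risk_assessment(text):
    """Extract the five bold-labelled fields; raise ValueError on a missing field or bad level."""
    result = {}
    for field in FIELDS:
        match = re.search(rf"\*\*{re.escape(field)}:\*\*\s*(.+)", text)
        if match is None:
            raise ValueError(f"missing field: {field}")
        result[field] = match.group(1).strip()
    if result["Risk Level"] not in RISK_LEVELS:
        raise ValueError(f"unexpected risk level: {result['Risk Level']}")
    return result


# Hand-written sample in the requested format (not actual model output).
sample = """**Risk Level:** High
**Justification:** Repeated connections to a blacklisted IP followed by a port scan.
**Business Impact:** Possible unauthorized access to internal services.
**Likelihood of Malicious Activity:** High - Pattern matches reconnaissance.
**Investigation Priority:** Immediate - Active scanning in progress."""

parsed = parse_risk_assessment(sample)
```

Rejecting responses with an out-of-vocabulary risk level is one way to catch format drift before results reach a dashboard or ticketing system.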
Cause analysis:
ollama run stratosphere/qwen2.5-1.5b-slips-immune-risk:q4_k_m '<paste the cause analysis prompt here>'
Risk assessment:
ollama run stratosphere/qwen2.5-1.5b-slips-immune-risk:q4_k_m '<paste the risk assessment prompt here>'
For repeatable evaluation, use temperature 0 through the Ollama API or the OpenAI-compatible endpoint.
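from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint; the client requires an API key,
# but Ollama does not use it.
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)

# Replace with the full risk assessment prompt, including metadata and evidence.
risk_prompt = "<paste the risk assessment prompt here>"

response = client.chat.completions.create(
    model="stratosphere/qwen2.5-1.5b-slips-immune-risk:q4_k_m",
    messages=[
        {"role": "user", "content": risk_prompt},
    ],
    temperature=0,  # temperature 0 for repeatable evaluation
    max_tokens=1024,
)
print(response.choices[0].message.content)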
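The same request can also go through Ollama's native chat API instead of the OpenAI-compatible endpoint. The sketch below uses only the standard library and assumes a local server on the default port. The helper names are illustrative, and `num_predict` is used here as the native counterpart of `max_tokens`.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "stratosphere/qwen2.5-1.5b-slips-immune-risk:q4_k_m"


def build_chat_payload(prompt, model=MODEL):
    """Build a non-streaming /api/chat request body with temperature 0."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": {"temperature": 0, "num_predict": 1024},
    }


def chat(prompt, url=OLLAMA_CHAT_URL):
    """POST the payload to an Ollama server and return the reply text."""
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


payload = build_chat_payload("<paste the risk assessment prompt here>")
# With a running Ollama server:
# print(chat("<paste the risk assessment prompt here>"))
```

Setting `stream` to `False` returns a single JSON object, which keeps the response handling to one line.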
The risk model was evaluated on held-out Slips IDS incidents with an independent LLM-as-judge. The judge ranks model outputs for cause analysis and risk assessment separately while randomizing labels to reduce position bias.
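Label randomization of the kind described above can be sketched as follows. This is not the repository's evaluation code: the function names and the letter labels are assumptions that show one way to hide model identity and presentation order from a judge.

```python
import random
import string


def anonymize_outputs(outputs, rng):
    """Shuffle model outputs and assign neutral labels (A, B, ...) for the judge.

    Returns (labelled, key): labelled maps label -> output text, and key maps
    label -> original model name so the judge's ranking can be decoded.
    """
    models = list(outputs)
    rng.shuffle(models)
    labels = string.ascii_uppercase[: len(models)]
    labelled = {label: outputs[model] for label, model in zip(labels, models)}
    key = dict(zip(labels, models))
    return labelled, key


def decode_ranking(ranked_labels, key):
    """Translate the judge's ranked labels back to model names."""
    return [key[label] for label in ranked_labels]


# Placeholder texts; a real run would hold each model's full response.
outputs = {"gpt-4o": "text 1", "finetuned": "text 2", "baseline": "text 3"}
labelled, key = anonymize_outputs(outputs, random.Random(0))
ranking = decode_ranking(["B", "A", "C"], key)
```

Drawing a fresh shuffle per incident keeps any single model from always occupying the same slot in the judge prompt.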
The local evaluation path in this repository is:
python3 run_finetuned_inference_risk.py \
  --model-name stratosphere/qwen2.5-1.5b-slips-immune-risk:q4_k_m \
  --url http://localhost:11434/v1 \
  --input risk_filtered_eval.json \
  --output risk_finetuned_eval_results.json
python3 ../alert_summary/evaluate_risk.py \
  --input risk_finetuned_eval_results.json \
  --output ../alert_summary/results/risk_finetuned_results.json
Reported held-out results from the risk model card:
| Model | Avg Position | Avg Cause Score | Avg Risk Score | Win Rate |
|---|---|---|---|---|
| GPT-4o | 1.70 | 15.33 | 11.99 | 40.3% |
| Qwen2.5-1.5B Immune Risk | 1.73 | 15.58 | 10.27 | 37.3% |
| GPT-4o-mini | 2.11 | 15.31 | 11.63 | 19.4% |
| Qwen2.5 1.5B baseline | 3.48 | 9.15 | 8.79 | 3.0% |
| Qwen2.5 3B baseline | 3.53 | 7.40 | 9.61 | 0.0% |
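The Avg Position and Win Rate columns could be derived from per-incident rankings along the following lines. This sketch assumes that win rate means the share of incidents in which a model is ranked first. The reported win rates sum to 100%, which fits that reading but does not confirm it. The function name and the toy rankings are illustrative.

```python
from collections import defaultdict


def summarize_rankings(rankings):
    """Compute average position and first-place share (in percent) per model.

    rankings: list of lists, each ordering model names from best to worst
    for one incident.
    """
    positions = defaultdict(list)
    for ranking in rankings:
        for position, model in enumerate(ranking, start=1):
            positions[model].append(position)
    total = len(rankings)
    return {
        model: {
            "avg_position": sum(pos) / len(pos),
            "win_rate": 100.0 * pos.count(1) / total,
        }
        for model, pos in positions.items()
    }


# Toy rankings for three hypothetical models over four incidents.
toy = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
    ["a", "b", "c"],
]
summary = summarize_rankings(toy)
```

A lower average position is better, which is why GPT-4o (1.70) and the fine-tuned model (1.73) lead the table.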
This fine-tuned model is released under the Apache 2.0 License, consistent with the base Qwen2.5 model license.