```
ollama run f0rc3ps/nu11secur1tyAIRedTeamLite
ollama launch claude --model f0rc3ps/nu11secur1tyAIRedTeamLite
ollama launch codex --model f0rc3ps/nu11secur1tyAIRedTeamLite
ollama launch opencode --model f0rc3ps/nu11secur1tyAIRedTeamLite
ollama launch openclaw --model f0rc3ps/nu11secur1tyAIRedTeamLite
```
⚠️ WARNING: Any malicious use of this model is punishable by law.
Retrieval-Augmented Generation (RAG) is an AI architecture that combines information retrieval with text generation. Instead of just generating answers from trained knowledge, RAG first retrieves relevant information from a knowledge base and then generates responses based on that retrieved context.
User Query → [RETRIEVAL] → Relevant Documents → [LLM] → Contextual Answer
| Step | Process | Output |
|---|---|---|
| 1 | Source files (*.c, *.py, *.txt) | Raw text |
| 2 | Text extraction | Clean text preview |
| 3 | Embeddings (384-dim vectors) | Numerical vectors |
| 4 | Vector index (FAISS) | Fast search index |

| Component | Technology | Purpose |
|---|---|---|
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 | Convert text to vectors (384-dim) |
| Vector Search | FAISS | Fast similarity search |
| LLM | qwen2.5-coder:7b (Apache 2.0) | Answer generation |
| Storage | Pickle + FAISS | Persistent index |
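A minimal sketch of the retrieval step, assuming nothing beyond NumPy: the brute-force inner-product search below mimics what FAISS's `IndexFlatIP` does on normalized vectors, and the random vectors stand in for all-MiniLM-L6-v2 embeddings (the document texts and query are invented for illustration).

```python
import numpy as np

# Toy 384-dim embeddings standing in for all-MiniLM-L6-v2 output;
# a real system would call SentenceTransformer(...).encode(docs) instead.
rng = np.random.default_rng(0)
docs = [
    "buffer overflow in parser",
    "SQL injection in login form",
    "XSS in comment field",
]
doc_vecs = rng.normal(size=(len(docs), 384)).astype("float32")
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)  # normalize for cosine

def retrieve(query_vec, k=2):
    # Mirrors faiss.IndexFlatIP.search: inner product == cosine on unit vectors.
    scores = doc_vecs @ query_vec
    top = np.argsort(-scores)[:k]
    return [(docs[i], float(scores[i])) for i in top]

# Build a query vector near document 1 by adding a little noise.
query_vec = doc_vecs[1] + 0.01 * rng.normal(size=384)
query_vec /= np.linalg.norm(query_vec)
hits = retrieve(query_vec)
# The top hit is the document the query was derived from.
print(hits[0][0])
```

The retrieved snippets would then be pasted into the LLM prompt as context, which is the "generation" half of RAG.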

| Aspect | RAG | Fine-tuning |
|---|---|---|
| Hardware | ✅ CPU only | ❌ GPU required (8-12GB VRAM) |
| Speed | ✅ Milliseconds | ❌ Hours/days |
| Updates | ✅ Instant (add files) | ❌ Retrain everything |
| Accuracy | ✅ Based on real data | ❌ May hallucinate |
| Memory | ✅ 2-4GB RAM | ❌ 8-12GB VRAM |
| Cost | ✅ Free | ❌ Expensive |
During fine-tuning, you literally change neuron values:
Before → After
Weight W₁ = 0.2345 → 0.2891
Weight W₂ = -0.5678 → -0.5123
Weight W₃ = 0.8912 → 0.9345
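A fine-tuning step is exactly such an in-place update: gradient descent nudges each weight against its gradient. A one-line sketch that reproduces the illustrative numbers above (the gradient values are invented to make the arithmetic land on those figures):

```python
import numpy as np

# One gradient-descent step per weight: W <- W - lr * dLoss/dW
w = np.array([0.2345, -0.5678, 0.8912])     # weights before fine-tuning
grad = np.array([-0.546, -0.555, -0.433])   # toy gradients from some loss
lr = 0.1                                    # learning rate

w_after = w - lr * grad                     # weights after one update
print(w_after)  # [ 0.2891 -0.5123  0.9345]
```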
Full Fine-tuning: All weights updated - needs 12-24GB VRAM
LoRA (Low-Rank Adaptation): Add small adapters instead of changing all weights
Original weight: W
LoRA adds: A × B → W′ = W + (A × B)
Saves ~95% of memory
QLoRA: Same as LoRA but with 4-bit quantization - needs only 6-8GB VRAM
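The LoRA update W′ = W + (A × B) can be written out directly. A sketch with invented shapes (a 1024×1024 layer, rank 8) showing why the adapters are tiny compared with W:

```python
import numpy as np

d, r = 1024, 8                       # layer size and LoRA rank (illustrative)
W = np.zeros((d, d))                 # frozen pretrained weight, never updated
A = np.random.randn(d, r) * 0.01     # trainable low-rank adapter
B = np.random.randn(r, d) * 0.01    # trainable low-rank adapter

W_eff = W + A @ B                    # W' = W + (A x B), applied at inference

full_params = W.size                 # parameters a full fine-tune would touch
lora_params = A.size + B.size        # parameters LoRA actually trains
print(f"trainable fraction: {lora_params / full_params:.2%}")  # 1.56%
```

Only A and B receive gradients, so optimizer state and gradient memory shrink accordingly; QLoRA additionally stores the frozen W in 4-bit form.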
| Method | VRAM Needed | Speed | Quality |
|---|---|---|---|
| Full Fine-tuning | 12-24GB | Slow | Best |
| LoRA | 8-12GB | Fast | Good |
| QLoRA | 6-8GB | Fast | Good |
| RAG (Lite Edition) | CPU only | Instant | Excellent |
FINE-TUNING (knowledge in weights) vs RAG (knowledge in vector space)
Model REMEMBERS information vs Model SEARCHES a database
Weights CHANGE: [0.23→0.31], [-0.56→-0.62], [0.89→0.75] vs Weights STAY unchanged
GPU required: 8-24GB vs CPU only: 2-4GB RAM
Training time: hours/days vs Setup: minutes
Updates: retrain everything vs Updates: add files
Hallucinations: possible vs Hallucinations: greatly reduced (answers grounded in retrieved sources)
| Use Case | Best Method |
|---|---|
| Chat with documents | ✅ RAG |
| Question answering | ✅ RAG |
| Search in database | ✅ RAG |
| Change model personality | 🔧 Fine-tuning |
| New language learning | 🔧 Fine-tuning |
| Specialized task mastery | 🔧 Fine-tuning |
| Operation | Time (5000 docs) |
|---|---|
| Embedding creation | ~5-10 minutes |
| Index building | ~1 second |
| Query search | <100 ms |
| Memory usage | ~2-4 GB RAM |
✅ No GPU required
✅ Always up-to-date knowledge
✅ No retraining needed
✅ Transparent sources
✅ Low memory footprint
✅ Fast responses
✅ Easy to update
✅ Cost-effective
RAG ENGINE (FIXED REQUIREMENTS):
LLM ENGINE → Qwen2.5-Coder 7B (Apache 2.0, Tools Support):
| Component | Minimum | Recommended |
|---|---|---|
| GPU VRAM | 4 GB | 6-8 GB |
| RAM | 8 GB | 16 GB |
| Speed (CPU i7/Ryzen 7) | 15-25 t/s | 25-35 t/s |
| Speed (GPU) | 50-80 t/s | 80-120 t/s |
Model Size (Q4_K_M): ~4.5 GB
MAC (UNIFIED MEMORY):
| Model | RAM | Performance |
|---|---|---|
| M1 | 8GB | 12-18 t/s |
| M2 | 16GB | 20-30 t/s |
| M3 | 16GB | 25-35 t/s |
| M3 Pro/Max | 32GB | 35-50 t/s |
Budget Lite ($500-800):
Sweet Spot Lite ($800-1200):
Ultra Lite (Raspberry Pi 5 / Old Laptop):
Mac Lite:
TOTAL = RAG(2-4GB) + LLM_SIZE(4.5GB) + 10% = ~7-9GB
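The total is straightforward arithmetic; a quick check of the formula at both ends of the RAG memory range, using the figures quoted above:

```python
# TOTAL = RAG + LLM_SIZE + 10% overhead, at both ends of the RAG range.
rag_gb = (2.0, 4.0)      # RAG engine RAM, low and high end
llm_gb = 4.5             # Qwen2.5-Coder 7B at Q4_K_M

totals = [(r + llm_gb) * 1.10 for r in rag_gb]
print(totals)  # about 7.15 and 9.35 -> the ~7-9GB quoted above
```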
Model Sizes (Q4):
3B = ~2.2 GB
7B = ~4.5 GB (Lite)
16B = ~9 GB
30B = ~16 GB
70B = ~38 GB
This model is RESPONSIBLE and EDUCATIONAL.
If this model helps you in your security research, penetration testing, or red team operations, consider supporting its continued development and maintenance.
Your support helps:
- Keep the model free for everyone
- Add more repositories and knowledge sources
- Maintain regular updates with latest CVEs and exploits
- Improve response quality and RAG performance
Donate directly:
🔗 https://www.paypal.com/donate/?hosted_button_id=ZPQZT5XMC5RFY
This RAG system represents $50,000+ in development value: 17+ repositories indexed, FAISS vector search, an automated update pipeline, and three production-ready models.
If your organization needs:
Contact for enterprise licensing and consulting:
📧 Email: nu11secur1typentest@gmail.com
💼 LinkedIn: (link in profile)
Starting at $5,000-$20,000 per deployment, depending on requirements.
| What you get | DIY | Enterprise |
|---|---|---|
| RAG system with 17+ repos | ✅ Free | ✅ Included |
| Custom repository integration | ❌ You add yourself | ✅ We add for you |
| Private air-gapped deployment | ❌ Self-managed | ✅ Full setup |
| SLA & support | ❌ None | ✅ 24/7 |
| Team training | ❌ Self-taught | ✅ Workshop |
| Cost | $0 | $5,000+ |
All proceeds fund the continued development of free open-source models.
Built by nu11secur1ty 🔥