HackIDLE-NIST-Coder v1.1
A cybersecurity expert model fine-tuned on a comprehensive collection of NIST publications.
Training Data
Fine-tuned on 530,912 examples from 596 NIST documents:
- FIPS standards (cryptography)
- SP 800 series (security guidelines)
- SP 1800 series (practice guides)
- IR series (technical reports)
- CSWP series (white papers): CSF 2.0, Zero Trust Architecture, Post-Quantum Cryptography, IoT Security, Privacy Engineering
Dataset: ethanolivertroy/nist-cybersecurity-training - Largest public NIST cybersecurity dataset on Hugging Face
Training Method
- Base Model: Qwen2.5-Coder-7B-Instruct-4bit (chosen for its strength on technical content)
- Technique: LoRA (Low-Rank Adaptation) - 11.5M trainable parameters (0.151% of total)
- Framework: MLX (Apple Silicon optimized via Metal GPU)
- Hardware: Apple M4 Max, 128GB RAM
- Training: 1,000 iterations (~3.5 hours), plus 200 iterations after a checkpoint recovery
- Final Loss: 1.420 training / 1.512 validation (12.5% improvement over v1.0)
- Performance: 130-160 tokens/sec inference on M4 Max
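As a quick sanity check on the LoRA figures above, the stated trainable-parameter count and percentage imply a total model size. The sketch below only rearranges the two numbers from the list; the variable names are illustrative, and the result is an estimate rather than an official parameter count.

```python
# Figures taken from the "Training Method" list above.
trainable_params = 11.5e6        # LoRA trainable parameters
trainable_fraction = 0.151 / 100 # 0.151% of total, as a fraction

# Implied total parameter count of the base model (an estimate only).
implied_total = trainable_params / trainable_fraction

print(f"Implied total parameters: {implied_total / 1e9:.2f}B")
```

The result lands in the range expected for a 7B-class base model, which is consistent with the Qwen2.5-Coder-7B base listed above.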
What’s New in v1.1
- +28 CSWP documents: CSF 2.0, Zero Trust Architecture, Post-Quantum Cryptography, IoT Security, Privacy Framework v1.0
- +7,206 training examples (530,912 total, up from 523,706)
- Fixed 6,150 malformed DOI links and improved link quality
- Added latest standards: SP 800-63-4 (Digital Identity Guidelines Rev. 4, July 2025 release)
- Improved training quality: lower validation loss and better convergence than v1.0
Key Capabilities
- NIST SP 800-53 Rev. 5 security controls
- Cybersecurity Framework (CSF) 2.0 with GOVERN function
- Zero Trust Architecture (SP 800-207)
- Risk Management Framework (RMF)
- FIPS cryptographic standards
- Cloud security (SP 800-145, SP 800-146)
- Post-quantum cryptography migration guidance
- Privacy Framework v1.0
- Supply chain risk management
- IoT cybersecurity
- Digital identity (SP 800-63-4, including passkeys and deepfake detection)
Usage Examples
Basic query:
ollama run etgohome/hackidle-nist-coder:v1.1 "What is Zero Trust Architecture?"
Technical implementation:
ollama run etgohome/hackidle-nist-coder:v1.1 "Write a Python script to audit AWS S3 buckets for NIST compliance"
GRC workflow:
ollama run etgohome/hackidle-nist-coder:v1.1 "Explain NIST RMF authorization process for a cloud system"
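The same prompts can also be sent programmatically. The following is a minimal sketch, assuming a local Ollama server is running on its default port (11434) and the model has already been pulled; the helper names `build_request` and `ask` are hypothetical and not part of this model's distribution.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "etgohome/hackidle-nist-coder:v1.1"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama REST API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example call (requires the Ollama server to be running):
# print(ask("What is Zero Trust Architecture?"))
```

Setting `"stream": False` returns a single JSON object rather than a stream of chunks, which keeps the sketch short; for long answers, streaming may be preferable.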
License
CC0 1.0 Universal (Public Domain) - All NIST publications are in the public domain. Free for commercial and research use.
Credits
- NIST Computer Security Resource Center
- Qwen2.5-Coder base model (Alibaba Cloud)
- MLX framework (Apple)
- Training dataset: ethanolivertroy/nist-cybersecurity-training
Version: 1.1
Release Date: October 2025
Model Size: 4.7GB (Q4_K_M GGUF)
Context Length: 32K tokens