NIST cybersecurity expert trained on 596 publications, 530K examples. v1.1 adds CSF 2.0, Zero Trust, Post-Quantum Crypto. Dataset: huggingface.co/datasets/ethanolivertroy/nist-cybersecurity-training

HackIDLE-NIST-Coder v1.1

A cybersecurity expert model fine-tuned on a broad corpus of NIST publications.

Training Data

Fine-tuned on 530,912 examples from 596 NIST documents:

  • FIPS standards (cryptography)
  • SP 800 series (security guidelines)
  • SP 1800 series (practice guides)
  • IR series (technical reports)
  • CSWP series (white papers): CSF 2.0, Zero Trust Architecture, Post-Quantum Cryptography, IoT Security, Privacy Engineering

Dataset: ethanolivertroy/nist-cybersecurity-training - Largest public NIST cybersecurity dataset on Hugging Face
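
To inspect the training data directly, the dataset can be pulled from Hugging Face. The following is a minimal sketch, assuming the third-party `datasets` package is installed and the dataset loads with its default configuration. The helper name `peek` is hypothetical, and the split and column names are printed rather than assumed because this page does not state them.

```python
DATASET_ID = "ethanolivertroy/nist-cybersecurity-training"


def peek(n: int = 3) -> None:
    """Print the dataset's splits and the first n records of its first split."""
    # Third-party dependency (assumption): pip install datasets
    from datasets import load_dataset

    ds = load_dataset(DATASET_ID)  # downloads the data on first use
    print(ds)  # shows split names, column names, and row counts

    first_split = next(iter(ds))  # a DatasetDict iterates over split names
    for record in ds[first_split].select(range(n)):
        print(record)
```

Calling `peek()` requires network access; with roughly 530K examples the first download may take a while.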

Training Method

  • Base Model: Qwen2.5-Coder-7B-Instruct-4bit (chosen for its strength on technical content)
  • Technique: LoRA (Low-Rank Adaptation) - 11.5M trainable parameters (0.151% of total)
  • Framework: MLX (optimized for Apple Silicon, runs on the Metal GPU)
  • Hardware: Apple M4 Max, 128GB RAM
  • Training: 1,000 iterations (~3.5 hours), plus 200 iterations after a checkpoint recovery
  • Final Loss: 1.420 training / 1.512 validation (12.5% improvement over v1.0)
  • Performance: 130-160 tokens/sec inference on M4 Max
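
The two LoRA figures in the list above also imply the total parameter count of the base model. A quick arithmetic check, using only the numbers quoted here (the variable names are illustrative):

```python
# Figures quoted in the Training Method list.
trainable_params = 11.5e6          # LoRA trainable parameters
trainable_fraction = 0.151 / 100   # 0.151% of total parameters

# Total parameter count implied by those two figures.
implied_total = trainable_params / trainable_fraction
print(f"Implied total parameters: {implied_total / 1e9:.2f}B")  # → 7.62B
```

The result is consistent with a 7B-class base model, so the two quoted figures agree with each other.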

What’s New in v1.1

  • +28 CSWP documents: CSF 2.0, Zero Trust Architecture, Post-Quantum Cryptography, IoT Security, Privacy Framework v1.0
  • +7,206 training examples (530,912 total, up from 523,706)
  • Fixed 6,150 malformed DOI links and improved link quality
  • Added latest standards: SP 800-63-4 (Digital Identity Guidelines Rev. 4, July 2025 release)
  • Improved training quality: lower validation loss and better convergence than v1.0

Key Capabilities

  • NIST SP 800-53 Rev. 5 security controls
  • Cybersecurity Framework (CSF) 2.0 with GOVERN function
  • Zero Trust Architecture (SP 800-207)
  • Risk Management Framework (RMF)
  • FIPS cryptographic standards
  • Cloud security (SP 800-145, SP 800-146)
  • Post-quantum cryptography migration guidance
  • Privacy Framework v1.0
  • Supply chain risk management
  • IoT cybersecurity
  • Digital identity (SP 800-63-4, including passkeys and deepfake detection)

Usage Examples

Basic query:

ollama run etgohome/hackidle-nist-coder:v1.1 "What is Zero Trust Architecture?"

Technical implementation:

ollama run etgohome/hackidle-nist-coder:v1.1 "Write a Python script to audit AWS S3 buckets for NIST compliance"

GRC workflow:

ollama run etgohome/hackidle-nist-coder:v1.1 "Explain NIST RMF authorization process for a cloud system"
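
The same queries can be sent from a script through Ollama's local HTTP API instead of the CLI. The following is a minimal sketch using only the Python standard library, assuming an Ollama server is running on its default port (11434) and the model has already been pulled. The helper names `build_request` and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "etgohome/hackidle-nist-coder:v1.1"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example, with the server running:
#   print(ask("What is Zero Trust Architecture?"))
```

Setting `"stream": False` returns the whole answer as a single JSON object, which keeps the sketch short; a streaming client would read one JSON object per line instead.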

License

CC0 1.0 Universal (Public Domain) - All NIST publications are in the public domain. Free for commercial and research use.

Credits

  • NIST Computer Security Resource Center
  • Qwen2.5-Coder base model (Alibaba Cloud)
  • MLX framework (Apple)
  • Training dataset: ethanolivertroy/nist-cybersecurity-training

  • Version: 1.1
  • Release Date: October 2025
  • Model Size: 4.7GB (Q4_K_M GGUF)
  • Context Length: 32K tokens