
Built from Unsloth's UD Q6_K_XL quant

tools


GGUF: https://huggingface.co/unsloth/Devstral-Small-2505-GGUF/blob/main/Devstral-Small-2505-UD-Q6_K_XL.gguf

  • 128k tags default to a 128k context, which requires around 35GB of VRAM.
  • 64k tags default to a 64k context, which requires around 27GB of VRAM. num_batch is set to 1024 (up from 512) for performance; you can tune num_batch to trade speed against VRAM usage.
  • cline tags are optimised for agentic coding with Cline and Roo Code.
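
If you want to experiment with the num_batch trade-off without creating a new tag, one option is to override it per request through Ollama's local REST API. The sketch below is only an illustration: the model name is a placeholder (this page does not list the exact tag names), 65536 is an assumed token count for "64k", and the server is assumed to be running on Ollama's default port.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: unchanged port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt, num_ctx=65536, num_batch=1024):
    """Build a non-streaming /api/generate body with per-request option overrides.

    num_ctx=65536 is an assumed value for a "64k" context; adjust as needed.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "num_batch": num_batch},
    }


def generate(payload, url=OLLAMA_URL):
    """POST the payload to Ollama and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # "your-devstral-tag:64k" is a hypothetical name; substitute a real tag.
    # Dropping num_batch back to 512 should lower VRAM use at some cost in speed.
    payload = build_payload("your-devstral-tag:64k", "Say hello.", num_batch=512)
    print(generate(payload))
```

Overriding options this way affects only the request that carries them, so it is a convenient way to compare a few num_batch values before settling on one.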