GGUF: https://huggingface.co/unsloth/Devstral-Small-2505-GGUF/blob/main/Devstral-Small-2505-UD-Q6_K_XL.gguf
- 128k tags default to a 128k context, which requires around 35GB of VRAM.
- 64k tags default to a 64k context, which requires around 27GB of VRAM. They set num_batch to 1024 (up from 512) for performance; you can tune num_batch to trade off speed against VRAM requirements.
- cline tags are optimised for agentic coding with Cline and Roo Code.
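
The num_batch tuning mentioned for the 64k tags can also be tried per request instead of being baked into a tag. The sketch below is only an illustration and assumes the model is served by a local Ollama instance on its default port; the model name `devstral-example:64k` is a hypothetical placeholder for whichever tag you pulled, and 65536 is assumed to be what "64k" means in tokens. It overrides `num_ctx` and `num_batch` through the `options` field of Ollama's `/api/generate` endpoint.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, num_ctx=65536, num_batch=1024):
    """Build a generate payload that overrides context and batch size.

    The defaults mirror the 64k tags described above (assuming 64k = 65536
    tokens). Lowering num_batch should reduce VRAM use at some cost in speed.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a stream
        "options": {"num_ctx": num_ctx, "num_batch": num_batch},
    }


def generate(payload, url=OLLAMA_URL):
    """Send the payload to the Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

For example, `generate(build_request("devstral-example:64k", "Write hello world in Go.", num_batch=512))` would run one prompt with the smaller batch size, which makes it easy to compare speed and VRAM use before settling on a value.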