Mistral-Small-3.1-24B-Instruct-2503 is a quantized GGUF variant of Mistral Small 3.1 24B Instruct (2503), optimized to run smoothly on GPUs with as little as 16GB of VRAM.

Tags: tools · 24b

4e994e0f85a0 · 13GB · llama architecture · 23.6B parameters · IQ4_NL quantization

License: Apache License, Version 2.0

Default parameters: { "stop": [ "<s>", "[INST]" ], "temperature": 0.15 }

Readme

Mistral-Small-3.1-24B-Instruct-2503 (GGUF)

This repository provides the “Mistral-Small-3.1-24B-Instruct-2503” model converted into the GGUF format (Q4_K_L or Q4_K_M quantized version) for easy and efficient deployment via Ollama.

The original GGUF files were made available by bartowski on Hugging Face.
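
For local testing, a minimal sketch using the official ollama Python client might look like the following; the tag mistral-small-3.1 is a placeholder for whatever name the model is pulled under, not its official identifier.

```python
# Minimal sketch: query the model through the official Ollama Python client.
# Assumes `pip install ollama`, a running Ollama server, and that the model
# has been pulled locally; "mistral-small-3.1" is a placeholder tag, not the
# official model name.
import ollama

response = ollama.chat(
    model="mistral-small-3.1",
    messages=[
        {
            "role": "user",
            "content": "Summarize the benefits of GGUF quantization in two sentences.",
        }
    ],
)
print(response["message"]["content"])
```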

Model Features

  • Base Model: Mistral-Small-3.1-24B-Instruct (variant 2503)
  • Quantization: GGUF Q4_K_L quantization, significantly reducing resource requirements while preserving high-quality inference.
  • Low VRAM Usage: Optimized to run effectively on GPUs with around 16GB of VRAM.
  • Enhanced Reliability: Adjustable sampling temperature for stable, predictable generation (see the sketch after this list).
  • Native Tool Calling Capability: Integrated support for tool calling, enabling the model to call external functions and APIs.
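
As an illustration of the adjustable sampling mentioned above, the defaults from the model's parameter block (temperature 0.15 and the <s> / [INST] stop sequences) could be overridden per request, again assuming the ollama Python client and a placeholder model tag:

```python
# Sketch: override the default sampling options for a single request.
# The "options" mapping mirrors the model's default parameters
# ({ "stop": [ "<s>", "[INST]" ], "temperature": 0.15 }); the values here
# are illustrative, and "mistral-small-3.1" is a placeholder model tag.
import ollama

response = ollama.chat(
    model="mistral-small-3.1",
    messages=[{"role": "user", "content": "List three uses of a 24B instruct model."}],
    options={
        "temperature": 0.15,        # low temperature for stable, predictable output
        "stop": ["<s>", "[INST]"],  # stop sequences from the default parameters
    },
)
print(response["message"]["content"])
```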

Ideal Use Cases

This model is well suited for:

  • Local inference on consumer-grade GPUs.
  • Integration into projects that require advanced instruction- and command-following.
  • Environments where controlled inference quality and predictable responses are essential.
  • Leveraging tool calling for greater interactivity and functionality (a sketch follows this list).
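
The tool calling mentioned above could be exercised roughly as follows; this is a sketch only, in which the get_weather helper, its schema, and the model tag are hypothetical, and a reasonably recent ollama client is assumed:

```python
# Rough sketch of native tool calling via the Ollama chat API.
# The get_weather() helper, its JSON schema, and the model tag
# "mistral-small-3.1" are illustrative placeholders.
import ollama

MODEL = "mistral-small-3.1"  # placeholder tag

def get_weather(city: str) -> str:
    """Stand-in for a real weather API call."""
    return f"Sunny and 21 °C in {city}"

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the weather in Paris right now?"}]
first = ollama.chat(model=MODEL, messages=messages, tools=tools)
tool_calls = first["message"].get("tool_calls") or []

if tool_calls:
    # Keep the assistant's tool-call message in the history, run each tool,
    # and append its result before asking the model for a final answer.
    messages.append(first["message"])
    for call in tool_calls:
        if call["function"]["name"] == "get_weather":
            result = get_weather(**call["function"]["arguments"])
            messages.append({"role": "tool", "content": result, "name": "get_weather"})
    final = ollama.chat(model=MODEL, messages=messages)
    print(final["message"]["content"])
else:
    print(first["message"]["content"])
```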

Acknowledgements

Special thanks to Mistral AI for providing open access to high-quality foundation models, which makes derivatives such as this one possible. We are also grateful to bartowski (on Hugging Face) for the initial GGUF quantization and distribution.