
SmolLM-135M quantized to Q4_0 GGUF for efficient inference.

ollama run schroneko/smollm-135m:q4_0

Details

Updated 1 month ago

6d10e3567a82 · 92MB · llama · 135M · Q4_0

Readme

SmolLM-135M-GGUF

SmolLM-135M-GGUF is a language model converted to GGUF format for efficient on-device inference.

This is a Q4_0 quantized GGUF build of QuantFactory/SmolLM-135M-GGUF, converted using castkit.

Getting Started

ollama run schroneko/smollm-135m:q4_0

Quantization Details

Property       Value
Format         GGUF
Quantization   Q4_0
Source         QuantFactory/SmolLM-135M-GGUF
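Q4_0 in llama.cpp/GGUF packs weights in blocks of 32: each block stores one 16-bit scale plus 32 packed 4-bit values, i.e. 18 bytes per 32 weights (4.5 bits per weight). As a rough sanity check on the 92MB file size, a minimal back-of-envelope estimate (assuming all 135M parameters were stored as Q4_0; in practice token embeddings and some tensors are kept at higher precision, plus GGUF metadata, which accounts for the gap):

```python
# Q4_0 block layout in llama.cpp/GGUF: per block of 32 weights,
# one fp16 scale (2 bytes) + 32 packed 4-bit values (16 bytes) = 18 bytes.
BLOCK_SIZE = 32
BYTES_PER_BLOCK = 2 + BLOCK_SIZE // 2  # 18 bytes per block

def q4_0_bytes(n_params: int) -> int:
    """Rough size of n_params weights stored as Q4_0 (whole blocks)."""
    n_blocks = -(-n_params // BLOCK_SIZE)  # ceiling division
    return n_blocks * BYTES_PER_BLOCK

bits_per_weight = BYTES_PER_BLOCK * 8 / BLOCK_SIZE  # 4.5 bits/weight
est = q4_0_bytes(135_000_000)
print(f"{bits_per_weight} bits/weight, ~{est / 1e6:.1f} MB for 135M params")
```

This lands near 76MB for the weight tensors alone, consistent with the 92MB file once higher-precision embeddings and metadata are included.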

Key Features

  • GGUF format: Optimized for fast inference with llama.cpp and Ollama
  • Q4_0 quantization: Reduced memory footprint for on-device deployment
  • Ready to use: Run directly with ollama run schroneko/smollm-135m:q4_0
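Beyond the CLI, a running Ollama server exposes the model over its local REST API (`/api/generate` on port 11434 by default). A minimal sketch, assuming Ollama is installed and serving locally; `build_payload` and `generate` are illustrative helper names, not part of any library:

```python
import json
import urllib.request

# Default Ollama endpoint; assumes a local `ollama serve` is running.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt: str) -> bytes:
    """JSON request body for a non-streaming generate call."""
    return json.dumps({
        "model": "schroneko/smollm-135m:q4_0",
        "prompt": prompt,
        "stream": False,
    }).encode()

def generate(prompt: str) -> str:
    """Send the prompt to the local Ollama server and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running server):
#   print(generate("Say hello in one sentence."))
```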

Source

QuantFactory/SmolLM-135M-GGUF

License

Please refer to the original model card for license information.