| Tag | Size | Context window | Modality | Updated |
|---|---|---|---|---|
| granite-4.0:350m | 366MB | 1M | Text | 1 month ago |
| granite-4.0:1b | 1.6GB | 1M | Text | 1 month ago |
| granite-4.0:3.2b | 1.9GB | 1M | Text | 2 months ago |
| granite-4.0:3.4b | 2.0GB | 128K | Text | 2 months ago |
| granite-4.0:7b | 4.0GB | 1M | Text | 2 months ago |
Uploaded as Unsloth DynamicQuant2 (DQ2) versions. These are enterprise-focused models, built for tool use and structured output rather than creative chat.
All models here are Q4_0. The Unsloth DQ2 method may preserve most of the performance of the Q8 original, but that is not guaranteed; test it against your own use case before relying on it.
Temperature: A range of 0.4–0.6 works well for their intended instruction-following and tool-use tasks.
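A minimal sketch of setting that temperature per request, assuming the `ollama` Python package is installed and one of the tags above has already been pulled:

```python
import ollama

# Temperature 0.5 sits in the middle of the suggested 0.4-0.6 range.
response = ollama.chat(
    model="granite-4.0:3.4b",  # any tag from the table above
    messages=[
        {"role": "user", "content": "List three checks to run before deploying a quantized model."}
    ],
    options={"temperature": 0.5},
)

print(response["message"]["content"])
```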
IBM's Granite 4.0 is a series of lightweight, open foundation models (Apache 2.0) designed for enterprise applications. They excel at tool calling, structured JSON output, multilingual tasks, and fill-in-the-middle (FIM) code completion.
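A hedged sketch of the structured-output side, again assuming the `ollama` Python package and a locally pulled tag (the fields in the prompt are invented for the example); Ollama's `format="json"` option constrains the reply to valid JSON:

```python
import json
import ollama

# Ask the model to extract invented fields from free text and return JSON.
response = ollama.chat(
    model="granite-4.0:7b",
    messages=[
        {
            "role": "user",
            "content": (
                "Extract the fields name, company, and request_type from this message "
                "and reply with JSON only:\n"
                "'Hi, this is Dana Reyes from Acme Corp, I need a copy of last month's invoice.'"
            ),
        }
    ],
    format="json",
    options={"temperature": 0.4},
)

print(json.loads(response["message"]["content"]))
```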
The key difference between these models is their architecture and context length:

Granite 4.0 Series:
- granite-4.0:3.4b (micro): A dense model with a 128K context window.
- granite-4.0:3.2b (h-micro): A dense-hybrid model with a 1M context window.
- granite-4.0:7b (h-tiny): A larger dense-hybrid model, also with a 1M context window.

Granite 4.0 Nano Series (for on-device/IoT use):
- granite-4.0:350m (nano): A dense model with a 128K context window.
- granite-4.0:1b (nano): A dense model with a 128K context window.
- granite-4.0:350m-h (h-nano): A dense-hybrid model with a 1M context window.
- granite-4.0:1b-h (h-nano): A larger dense-hybrid model, also with a 1M context window.
Ideal for:
- Agentic workflows requiring tool-calling (see the sketch below)
- Structured data extraction into JSON
- Code completion, especially Fill-in-the-Middle (FIM)
- Enterprise RAG and other instruction-following tasks
- On-device or IoT applications (Nano series)
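A minimal tool-calling sketch. It assumes a recent `ollama` Python package (which can derive a tool schema from a typed Python function; older versions expect a JSON schema dict instead) and a hypothetical `get_invoice_status` function standing in for a real backend:

```python
import ollama

def get_invoice_status(invoice_id: str) -> str:
    """Look up the status of an invoice by its ID (hypothetical backend)."""
    return f"Invoice {invoice_id} was paid on 2025-01-15."

question = {"role": "user", "content": "What is the status of invoice INV-1042?"}

# First round: let the model decide whether to call the tool.
response = ollama.chat(
    model="granite-4.0:7b",
    messages=[question],
    tools=[get_invoice_status],
    options={"temperature": 0.4},
)

# Second round: execute any requested tool calls and feed the results back.
message = response["message"]
for call in message.tool_calls or []:
    result = get_invoice_status(**call.function.arguments)
    followup = ollama.chat(
        model="granite-4.0:7b",
        messages=[question, message, {"role": "tool", "content": result}],
    )
    print(followup["message"]["content"])
```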
Granite 4.0 HuggingFace Collection