
Granite-4.0-H-350M-GGUF


Modelfile Checkpoint

**Checkpoint:** 11/30/2025
**Author:** Jewelzufo, *J.A.G*
**Update:** *v1.1*
**Links:**
- [GitHub](https://github.com/Jewelzufo)
- [TechXchange](https://community.ibm.com/community/user//expert/juliangonzalez)
- [LinkedIn](https://www.linkedin.com/in/julian-g-7b533129a)

About

This Modelfile checkpoint documents the configuration and lineage of the model.
It is intended for reproducibility, sharing, and reference across deployments.


Model Summary

Granite-4.0-H-350M is a lightweight instruct model finetuned from Granite-4.0-H-350M-Base using a combination of open-source instruction datasets with permissive licenses and internally collected synthetic datasets. The model was developed using a diverse set of techniques, including supervised finetuning, reinforcement learning, and model merging.

Supported Languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may fine-tune Granite 4.0 Nano models to support languages beyond those included in this list.

Intended use

Granite 4.0 Nano instruct models feature strong instruction-following capabilities, bringing advanced AI within reach for on-device deployments and research use cases. Additionally, their compact size makes them well suited for fine-tuning on specialized domains without requiring massive compute resources.
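
As one illustration of that low-footprint adaptation path, the following is a minimal sketch of parameter-efficient fine-tuning with LoRA via the peft library. The Hugging Face model id and the target_modules names are assumptions, not values taken from this checkpoint, and should be verified before use.

```python
# Hedged sketch of parameter-efficient fine-tuning with LoRA.
# The model id and target_modules below are assumptions, not verified names.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "ibm-granite/granite-4.0-h-350m"  # assumed Hugging Face id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Train only small low-rank adapters on the attention projections so a
# specialized domain can be covered without updating all 350M parameters.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed module names; inspect the loaded model to confirm
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# From here, train with your preferred loop or trainer (e.g. transformers.Trainer).
```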

Capabilities

  • Summarization
  • Text classification
  • Text extraction
  • Question-answering
  • Retrieval Augmented Generation (RAG)
  • Code-related tasks
  • Function-calling tasks
  • Multilingual dialog use cases
  • Fill-In-the-Middle (FIM) code completions
  • Generation: a simple example of how to use the Granite-4.0-H-350M model follows this list.
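
The generation example below is a minimal sketch against a locally running Ollama server; the model tag granite-4.0-h-350m is a placeholder for whatever name this GGUF is pulled or created under.

```python
# Minimal sketch: one chat completion against a local Ollama server.
# Assumes Ollama is running on its default port (11434) and that this
# GGUF is available under the placeholder tag "granite-4.0-h-350m".
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "granite-4.0-h-350m",  # placeholder tag; use your local model name
        "messages": [
            {"role": "user", "content": "Summarize the benefits of small language models in two sentences."}
        ],
        "stream": False,
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["message"]["content"])
```

The same request shape covers the other chat-style capabilities listed above (classification, extraction, RAG answers); only the prompt content changes.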

Model Architecture

The Granite-4.0-H-350M baseline is built on a decoder-only dense transformer architecture. Its core components are GQA, Mamba2, an MLP with SwiGLU, RMSNorm, and shared input/output embeddings.
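
To make two of the named components concrete, here is an illustrative plain-PyTorch sketch of RMSNorm and a SwiGLU MLP. It is not the released implementation, and the dimensions are placeholders.

```python
# Illustrative sketch (not the released implementation) of two components
# named above: RMSNorm and a SwiGLU MLP block.
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root-mean-square of the features, then rescale.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * x * rms

class SwiGLUMLP(nn.Module):
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: SiLU-gated linear unit followed by a down projection.
        return self.down(nn.functional.silu(self.gate(x)) * self.up(x))
```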

Ethical Considerations and Limitations

Granite 4.0 Nano Instruct models are finetuned primarily on English instruction-response pairs, along with multilingual data covering several languages. Although the model can handle multilingual dialog use cases, its performance may not match that on English tasks. In such cases, introducing a small number of examples (few-shot prompting) can help the model generate more accurate outputs. While the model has been aligned with safety in mind, it may in some cases produce inaccurate, biased, or unsafe responses to user prompts. We therefore urge the community to use this model with safety testing and tuning tailored to their specific tasks.
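
As a concrete illustration of the few-shot suggestion above, the sketch below prepends two in-language demonstrations (German sentiment labels, invented here for illustration) ahead of the real query; the resulting message list can be sent through whatever chat interface is in use.

```python
# Minimal sketch of few-shot prompting for a non-English task: a couple of
# in-language demonstrations are placed before the actual query.
# The example pairs are invented for illustration.
few_shot_messages = [
    {"role": "user", "content": "Klassifiziere die Stimmung: 'Das Essen war ausgezeichnet.'"},
    {"role": "assistant", "content": "positiv"},
    {"role": "user", "content": "Klassifiziere die Stimmung: 'Der Service war sehr langsam.'"},
    {"role": "assistant", "content": "negativ"},
    # The real query follows the demonstrations.
    {"role": "user", "content": "Klassifiziere die Stimmung: 'Die Lieferung kam pünktlich an.'"},
]
```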


  • Developers: Granite Team, IBM
  • Function Calling: IBM Docs
  • GitHub Repository: Granite 4.0 Nano
  • Website: Granite Docs
  • Research: Granite Research
  • Release Date: October 28, 2025
  • License: Apache 2.0