ollama run aisingapore/Gemma-SEA-LION-v4-4B-VL
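The command above starts an interactive session. The same model can also be called programmatically through the local Ollama server; a minimal sketch using the ollama Python client, assuming the package is installed (`pip install ollama`) and the model has been pulled:

```python
# Minimal sketch: one chat turn against the locally served model.
# Assumes `ollama pull aisingapore/Gemma-SEA-LION-v4-4B-VL` has been run.
import ollama

response = ollama.chat(
    model="aisingapore/Gemma-SEA-LION-v4-4B-VL",
    messages=[
        # Indonesian: "Explain briefly what a large language model is."
        {"role": "user", "content": "Jelaskan secara singkat apa itu model bahasa besar."}
    ],
)
print(response["message"]["content"])
```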
SEA-LION is a collection of Large Language Models (LLMs) that have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
Gemma-SEA-LION-v4-4B-VL is a 4-billion parameter Vision-Language Model (VLM) built upon the gemma-3-4b-it architecture. To ensure domain adaptation for the region, the model underwent rigorous post-training on a curated dataset of approximately 6.7 million instruction-text pairs.
This extensive post-training instills multilingual and multicultural fluency across key SEA languages, including Indonesian, Vietnamese, Thai, Filipino, Tamil, Burmese, and Malay. The curated dataset also includes a filtered, open-source set of tool-calling instruction-text pairs, imparting tool-calling capabilities alongside linguistic fluency.
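Since the post-training data includes tool-calling pairs, the model can be exercised through Ollama's tool-calling interface. A hedged sketch: the `get_weather` tool below is hypothetical, and whether the model emits a structured call depends on its Ollama chat template:

```python
# Sketch of tool calling via the Ollama chat API.
# The `get_weather` tool is a hypothetical example, not part of the model card.
import ollama

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. Jakarta"},
                },
                "required": ["city"],
            },
        },
    }
]

response = ollama.chat(
    model="aisingapore/Gemma-SEA-LION-v4-4B-VL",
    messages=[{"role": "user", "content": "What's the weather in Hanoi right now?"}],
    tools=tools,
)

# If the model decided to call a tool, the structured call appears here.
for call in response["message"].get("tool_calls") or []:
    print(call["function"]["name"], call["function"]["arguments"])
```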
Gemma-SEA-LION-v4-4B-VL inherits the image and text capabilities of gemma-3-4b-it, along with its large 128K-token context length. Beyond extending the original Gemma model's multilingual capabilities to SEA languages, we also experimented with tool calling, as noted above.
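Because the model is a VLM, images go through the same chat interface. A minimal sketch, where `photo.jpg` is a placeholder for a local image file:

```python
# Sketch of an image + text request; `photo.jpg` is a placeholder path.
import ollama

response = ollama.chat(
    model="aisingapore/Gemma-SEA-LION-v4-4B-VL",
    messages=[
        {
            "role": "user",
            "content": "Describe this image in Thai.",
            "images": ["photo.jpg"],  # local file path; raw bytes also work
        }
    ],
)
print(response["message"]["content"])
```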
SEA-LION stands for Southeast Asian Languages In One Network.
We performed post-training in English and SEA languages on gemma-3-4b-it, a decoder model built on the gemma-3 architecture, to create Gemma-SEA-LION-v4-4B-VL.
For tokenization, the model employs the default tokenizer used in gemma-3-4b-it.
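For instance, that tokenizer can be loaded with Hugging Face transformers. A sketch assuming access to the gated google/gemma-3-4b-it repository (the SEA-LION Hugging Face repository should ship the same tokenizer):

```python
# Sketch: load the gemma-3-4b-it tokenizer via transformers.
# google/gemma-3-4b-it is gated on Hugging Face; accepting its license may be required.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
ids = tokenizer("Selamat pagi, Singapura!").input_ids
print(len(ids), tokenizer.decode(ids))
```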
For more details, please refer to AI Singapore's Hugging Face page for this model. The original GGUF files can be obtained from the accompanying Hugging Face repository.
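If you want the GGUF files directly, something like the following works with huggingface_hub; the repo id below is a placeholder, since the actual repository is linked from the model page:

```python
# Sketch: fetch GGUF files with huggingface_hub.
# The repo id is a PLACEHOLDER; substitute the actual GGUF repository
# linked from AI Singapore's Hugging Face page.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="aisingapore/REPLACE-WITH-GGUF-REPO",
    allow_patterns=["*.gguf"],  # download only the GGUF weight files
)
print(local_dir)
```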