Open-weight Brazilian Portuguese LLM trained from scratch on 1.6B tokens. 87.8M parameters, Llama-style architecture with grouped-query attention (GQA). Validation perplexity: 21.34. License: Apache 2.0. Base model.
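
To put the figures above in context, here is a minimal sketch that converts the reported validation perplexity into a mean per-token cross-entropy and computes the ratio of training tokens to parameters. It assumes perplexity was computed as exp of the mean per-token cross-entropy in nats, which the description does not state. The function and variable names are illustrative only and are not taken from the Maracatu code.

```python
import math

# Figures quoted in the model description.
VALIDATION_PERPLEXITY = 21.34
PARAMETERS = 87.8e6
TRAINING_TOKENS = 1.6e9


def perplexity_to_loss_nats(perplexity: float) -> float:
    """Mean per-token cross-entropy in nats, ASSUMING perplexity = exp(loss)."""
    return math.log(perplexity)


loss_nats = perplexity_to_loss_nats(VALIDATION_PERPLEXITY)
loss_bits = loss_nats / math.log(2)  # same quantity expressed in bits
tokens_per_parameter = TRAINING_TOKENS / PARAMETERS

print(f"cross-entropy: {loss_nats:.3f} nats/token ({loss_bits:.3f} bits/token)")
print(f"training tokens per parameter: {tokens_per_parameter:.1f}")
```

Perplexity depends on the tokenizer, so this number is only comparable with models evaluated on the same validation set and with the same tokenization.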

Apache License 2.0
Maracatu-80M weights and tokenizer released under Apache 2.0.
Code: https://github.com/maracatu-ai/maracatu
Model card: https://huggingface.co/maracatu-ai/maracatu-80m