OLMo 2 is a new family of 7B and 13B models trained on up to 5T tokens. These models are on par with or better than equivalently sized fully open models, and competitive with open-weight models such as Llama 3.1 on English academic benchmarks.
2.6M Pulls 9 Tags Updated 8 months ago
Orca 2 is built by Microsoft Research and is a fine-tuned version of Meta's Llama 2 models. The model is designed to excel particularly at reasoning.
77.3K Pulls 33 Tags Updated 1 year ago
A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
2.5M Pulls 17 Tags Updated 1 month ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
2.2M Pulls 5 Tags Updated 8 months ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo on code-specific tasks.
1.1M Pulls 64 Tags Updated 1 year ago
OpenCoder is an open and reproducible code LLM family which includes 1.5B and 8B models, supporting chat in English and Chinese languages.
135.2K Pulls 9 Tags Updated 9 months ago
Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.
119.4K Pulls 67 Tags Updated 12 months ago
Open-source medical large language model adapted from Llama 2 to the medical domain.
88.6K Pulls 22 Tags Updated 1 year ago
An open large reasoning model for real-world solutions by the Alibaba International Digital Commerce Group (AIDC-AI).
50.2K Pulls 5 Tags Updated 9 months ago
Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.
34.1K Pulls 17 Tags Updated 1 year ago
A new state-of-the-art version of the lightweight Command R7B model that excels in advanced Arabic language capabilities for enterprises in the Middle East and Northern Africa.
17.5K Pulls 5 Tags Updated 6 months ago
The TinyLlama project is an open endeavor to train a compact 1.1B Llama model on 3 trillion tokens.
2.8M Pulls 36 Tags Updated 1 year ago
A new state-of-the-art 70B model. Llama 3.3 70B offers performance similar to the Llama 3.1 405B model.
2.4M Pulls 14 Tags Updated 9 months ago
Stable LM 2 is a state-of-the-art 1.6B and 12B parameter language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.
128K Pulls 84 Tags Updated 1 year ago
An open weights function calling model based on Llama 3, competitive with GPT-4o function calling capabilities.
32.4K Pulls 17 Tags Updated 1 year ago
An 8-bit quantized version of gemma3:27b fine-tuned with YagCed/Aveva_PML_test on HF. If you're from Aveva and want this model removed from public view, please let me know.
55 Pulls 1 Tag Updated 3 months ago
A very small test: gemma3:1b fine-tuned on a dataset obsessed with spiders. As a result, this model puts spiders in all its answers. It is useless: just a pet project to learn how to generate a dataset and fine-tune a small model.
35 Pulls 1 Tag Updated 4 months ago
Ticketeer is an experienced Agile Practitioner and Scrum Master.
9 Pulls 1 Tag Updated 4 months ago
Jin3.5-base, based on Jin 3.5 max.
4 Pulls 1 Tag Updated 4 days ago
A model made ideally for 1-on-1 roleplay, but also able to handle scenarios, RPGs, and storywriting. One of the more capable 8B models, built on the reliable Llama 3 base.
13.7K Pulls 1 Tag Updated 1 year ago