DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
60.3K Pulls 3 Tags Updated 3 weeks ago
The TinyLlama project is an open endeavor to train a compact 1.1B Llama model on 3 trillion tokens.
3.2M Pulls 36 Tags Updated 1 year ago
A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.
123.6K Pulls 33 Tags Updated 1 year ago
OLMo 2 is a new family of 7B and 13B models trained on up to 5T tokens. These models are on par with or better than equivalently sized fully open models, and competitive with open-weight models such as Llama 3.1 on English academic benchmarks.
3.4M Pulls 9 Tags Updated 11 months ago
1,597 Pulls 1 Tag Updated 1 year ago
pulled from https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q4_K_M.gguf
158 Pulls 1 Tag Updated 1 year ago
50 Pulls 1 Tag Updated 1 year ago
A coding assistant with a 32k context window.
27 Pulls 4 Tags Updated 1 year ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
2.9M Pulls 5 Tags Updated 11 months ago
Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.
209.6K Pulls 17 Tags Updated 2 years ago
The long-context version of Qwen2.5, supporting context lengths of up to 1M tokens.
1,325 Pulls 2 Tags Updated 9 months ago
Tool support and a 128k context length by default.
1,178 Pulls 1 Tag Updated 2 months ago
DeepCoder with tool calling (MCP) support.
212 Pulls 1 Tag Updated 7 months ago
[This is a fixed version of Command R which REALLY supports tool calls] Command R is a Large Language Model optimized for conversational interaction and long context tasks.
118 Pulls 1 Tag Updated 8 months ago
[This is a fixed version of Command A which REALLY supports tool calls] A 111 billion parameter model optimized for demanding enterprises that require fast, secure, and high-quality AI.
34 Pulls 1 Tag Updated 8 months ago
https://huggingface.co/localfultonextractor/Erosumika-7B-GGUF
1,125 Pulls 4 Tags Updated 1 year ago
Adapted for Cline tool / Roo Code use in VS Code. A fused model, a hybrid of DeepSeekR1 and Qwen2.5 Coder, from FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview.
4,301 Pulls 2 Tags Updated 10 months ago
Typhoon-OCR - A document parsing model built for Thai and English
2,205 Pulls 1 Tag Updated 6 months ago
Phi-4 is a 14B parameter, state-of-the-art open model from Microsoft. Clone of `phi4` with a tool calling template.
1,479 Pulls 1 Tag Updated 10 months ago
1,473 Pulls 1 Tag Updated 5 months ago