library

llama3.1

Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.

Tools 8B 70B 405B

7.1M Pulls 94 Tags Updated 5 weeks ago

llama3

Meta Llama 3: The most capable openly available LLM to date

8B 70B

6.5M Pulls 68 Tags Updated 5 months ago

mistral

The 7B model released by Mistral AI, updated to version 0.3.

Tools 7B

4.2M Pulls 84 Tags Updated 5 months ago

gemma

Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1

2B 7B

4.1M Pulls 102 Tags Updated 6 months ago

qwen

Qwen 1.5 is a series of large language models by Alibaba Cloud spanning from 0.5B to 110B parameters

0.5B 1.8B 4B 32B 72B 110B

4.1M Pulls 379 Tags Updated 4 months ago

qwen2

Qwen2 is a new series of large language models from Alibaba Group

Tools 0.5B 1.5B 7B 72B

3.9M Pulls 97 Tags Updated 4 months ago

phi3

Phi-3 is a family of lightweight 3B (Mini) and 14B (Medium) state-of-the-art open models by Microsoft.

3B 14B

2.6M Pulls 72 Tags Updated 4 months ago

llama2

Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.

7B 13B 70B

2.2M Pulls 102 Tags Updated 8 months ago

gemma2

Google Gemma 2 is a high-performing and efficient model available in three sizes: 2B, 9B, and 27B.

2B 9B 27B

1.6M Pulls 94 Tags Updated 5 weeks ago

llava

🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.

Vision 7B 13B 34B

1.6M Pulls 98 Tags Updated 8 months ago

llama3.2

Meta's Llama 3.2 goes small with 1B and 3B models.

Tools 1B 3B

1.5M Pulls 63 Tags Updated 3 weeks ago

codellama

A large language model that can use text prompts to generate and discuss code.

Code 7B 13B 34B 70B

1.4M Pulls 199 Tags Updated 5 months ago

nomic-embed-text

A high-performing open embedding model with a large token context window.

Embedding

1.1M Pulls 3 Tags Updated 7 months ago

qwen2.5

Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The model supports up to 128K tokens and has multilingual support.

Tools 0.5B 1.5B 3B 7B 14B 32B 72B

900.2K Pulls 133 Tags Updated 4 weeks ago

mxbai-embed-large

State-of-the-art large embedding model from mixedbread.ai

Embedding

495.2K Pulls 4 Tags Updated 6 months ago

mixtral

A set of Mixture of Experts (MoE) models with open weights by Mistral AI in 8x7b and 8x22b parameter sizes.

Tools 8x7B 8x22B

459K Pulls 69 Tags Updated 6 months ago

dolphin-mixtral

Uncensored 8x7b and 8x22b fine-tuned models, based on the Mixtral mixture of experts models, that excel at coding tasks. Created by Eric Hartford.

8x7B 8x22B

418.9K Pulls 87 Tags Updated 5 months ago

mistral-nemo

A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.

Tools 12B

406.6K Pulls 17 Tags Updated 4 weeks ago

starcoder2

StarCoder2 is the next generation of transparently trained open code LLMs that come in three sizes: 3B, 7B and 15B parameters.

Code 3B 7B

394.6K Pulls 67 Tags Updated 6 weeks ago

deepseek-coder-v2

An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.

Code 16B 236B

367.6K Pulls 65 Tags Updated 4 months ago

phi

Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.

3B

358K Pulls 18 Tags Updated 8 months ago

deepseek-coder

DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.

Code 1B 7B 33B

351.4K Pulls 102 Tags Updated 10 months ago

llama2-uncensored

Uncensored Llama 2 model by George Sung and Jarrad Hope.

7B

337.6K Pulls 34 Tags Updated 11 months ago

codegemma

CodeGemma is a collection of powerful, lightweight models that can perform a variety of coding tasks like fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following.

Code 2B 7B

331.7K Pulls 85 Tags Updated 6 months ago

dolphin-mistral

The uncensored Dolphin model based on Mistral that excels at coding tasks. Updated to version 2.8.

7B

246.8K Pulls 120 Tags Updated 6 months ago

qwen2.5-coder

The latest series of Code-Specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.

Tools 1.5B 7B

242.6K Pulls 67 Tags Updated 4 weeks ago

command-r

Command R is a Large Language Model optimized for conversational interaction and long context tasks.

Tools 35B

231.7K Pulls 32 Tags Updated 7 weeks ago

yi

Yi 1.5 is a high-performing, bilingual language model.

6B 9B 34B

227.5K Pulls 174 Tags Updated 5 months ago

dolphin-llama3

Dolphin 2.9 is a new model with 8B and 70B sizes by Eric Hartford based on Llama 3 that has a variety of instruction, conversational, and coding skills.

8B 70B

225.1K Pulls 53 Tags Updated 6 weeks ago

orca-mini

A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.

3B 7B 13B

221.1K Pulls 119 Tags Updated 11 months ago

zephyr

Zephyr is a series of fine-tuned versions of the Mistral and Mixtral models that are trained to act as helpful assistants.

7B 8x22B

218.7K Pulls 40 Tags Updated 6 months ago

llava-llama3

A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.

Vision 8B

190.7K Pulls 4 Tags Updated 5 months ago

snowflake-arctic-embed

A suite of text embedding models by Snowflake, optimized for performance.

Embedding 22M 33M

178.7K Pulls 16 Tags Updated 6 months ago

tinyllama

The TinyLlama project is an open endeavor to train a compact 1.1B Llama model on 3 trillion tokens.

1B

165.8K Pulls 36 Tags Updated 9 months ago

mistral-openorca

Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.

7B

158K Pulls 17 Tags Updated 12 months ago

starcoder

StarCoder is a code generation model trained on 80+ programming languages.

Code 1B 3B 7B 15B

155.9K Pulls 100 Tags Updated 12 months ago

vicuna

General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.

7B 13B 30B

149K Pulls 111 Tags Updated 11 months ago

codestral

Codestral is Mistral AI’s first-ever code model designed for code generation tasks.

Code 22B

148.7K Pulls 17 Tags Updated 7 weeks ago

granite-code

A family of open foundation models by IBM for Code Intelligence

Code 3B 8B 20B 34B

133.8K Pulls 162 Tags Updated 7 weeks ago

llama2-chinese

Llama 2 based model fine-tuned to improve Chinese dialogue ability.

7B 13B

133.3K Pulls 35 Tags Updated 12 months ago

wizard-vicuna-uncensored

Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.

7B 13B 30B

131.1K Pulls 49 Tags Updated 11 months ago

phi3.5

A lightweight AI model with 3.8 billion parameters whose performance overtakes similarly sized and larger models.

3B

127.3K Pulls 17 Tags Updated 2 months ago

wizardlm2

State of the art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning and agent use cases.

7B 8x22B

122.9K Pulls 22 Tags Updated 6 months ago

codegeex4

A versatile model for AI software development scenarios, including code completion.

Code 9B

120.3K Pulls 17 Tags Updated 3 months ago

all-minilm

Embedding models trained on very large sentence-level datasets.

Embedding 22M 33M

113.7K Pulls 10 Tags Updated 8 months ago

nous-hermes2

The powerful family of models by Nous Research that excels at scientific discussion and coding tasks.

34B

112K Pulls 33 Tags Updated 9 months ago

openchat

A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-0106.

7B

110.5K Pulls 50 Tags Updated 9 months ago

aya

Aya 23, released by Cohere, is a new family of state-of-the-art, multilingual models that support 23 languages.

8B 35B

109K Pulls 33 Tags Updated 7 weeks ago

codeqwen

CodeQwen1.5 is a large language model pretrained on a large amount of code data.

Code 7B

108.4K Pulls 30 Tags Updated 6 months ago

tinydolphin

An experimental 1.1B parameter model trained on the new Dolphin 2.8 dataset by Eric Hartford and based on TinyLlama.

1B

102.7K Pulls 18 Tags Updated 9 months ago

command-r-plus

Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases.

Tools 104B

101.1K Pulls 21 Tags Updated 7 weeks ago

wizardcoder

State-of-the-art code generation model

Code 7B 13B 33B 34B

100.4K Pulls 67 Tags Updated 9 months ago

stable-code

Stable Code 3B is a coding model with instruct and code completion variants on par with models such as Code Llama 7B that are 2.5x larger.

Code

97.5K Pulls 36 Tags Updated 7 months ago

openhermes

OpenHermes 2.5 is a 7B model fine-tuned by Teknium on Mistral with fully open datasets.

7B

96.6K Pulls 35 Tags Updated 9 months ago

mistral-large

Mistral Large 2 is Mistral's new flagship model that is significantly more capable in code generation, mathematics, and reasoning with 128k context window and support for dozens of languages.

Tools 123B

95.3K Pulls 17 Tags Updated 3 months ago

qwen2-math

Qwen2 Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperforms the mathematical capabilities of open-source models and even closed-source models (e.g., GPT4o).

1.5B 7B 72B

94.2K Pulls 52 Tags Updated 7 weeks ago

reflection

A high-performing model trained with a new technique called Reflection-tuning that teaches an LLM to detect mistakes in its reasoning and correct course.

70B

93.6K Pulls 17 Tags Updated 6 weeks ago

bakllava

BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.

Vision 7B

91.4K Pulls 17 Tags Updated 10 months ago

stablelm2

Stable LM 2 is a state-of-the-art 1.6B and 12B parameter language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.

1.6B 12B

90.8K Pulls 84 Tags Updated 5 months ago

llama3-gradient

This model extends Llama-3 8B's context length from 8k to over 1M tokens.

8B 70B

86.8K Pulls 35 Tags Updated 5 months ago

deepseek-llm

An advanced language model crafted with 2 trillion bilingual tokens.

7B 67B

85.9K Pulls 64 Tags Updated 10 months ago

wizard-math

Model focused on math and logic problems

7B 13B

84.9K Pulls 64 Tags Updated 10 months ago

glm4

A strong multi-lingual general language model with performance competitive with Llama 3.

9B

82.5K Pulls 32 Tags Updated 3 months ago

neural-chat

A fine-tuned model based on Mistral with good coverage of domain and language.

7B

78.3K Pulls 50 Tags Updated 6 months ago

llama3-chatqa

A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).

8B 70B

74.1K Pulls 35 Tags Updated 5 months ago

moondream

moondream2 is a small vision language model designed to run efficiently on edge devices.

Vision

73.9K Pulls 18 Tags Updated 5 months ago

xwinlm

Conversational model based on Llama 2 that performs competitively on various benchmarks.

7B 13B

73.1K Pulls 80 Tags Updated 11 months ago

smollm

🪐 A family of small models with 135M, 360M, and 1.7B parameters, trained on a new high-quality dataset.

72.3K Pulls 94 Tags Updated 2 months ago

nous-hermes

General use models based on Llama and Llama 2 from Nous Research.

7B 13B

71.8K Pulls 63 Tags Updated 11 months ago

sqlcoder

SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks

Code 7B 15B 70B

71.4K Pulls 48 Tags Updated 11 months ago

phind-codellama

Code generation model based on Code Llama.

Code 34B

71.2K Pulls 49 Tags Updated 10 months ago

yarn-llama2

An extension of Llama 2 that supports a context of up to 128k tokens.

7B 13B

68.6K Pulls 67 Tags Updated 11 months ago

dolphincoder

A 7B and 15B uncensored variant of the Dolphin model family that excels at coding, based on StarCoder2.

Code 7B

68.3K Pulls 35 Tags Updated 6 months ago

wizardlm

General use model based on Llama 2.

7B 13B 30B

67K Pulls 73 Tags Updated 6 months ago

deepseek-v2

A strong, economical, and efficient Mixture-of-Experts language model.

62.5K Pulls 34 Tags Updated 7 weeks ago

starling-lm

Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.

7B

59.5K Pulls 36 Tags Updated 10 months ago

samantha-mistral

A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.

7B

57.7K Pulls 49 Tags Updated 12 months ago

falcon

solar

A compact, yet powerful 10.7B large language model designed for single-turn conversation.

55.4K Pulls 32 Tags Updated 10 months ago

orca2

Orca 2 is built by Microsoft Research and is a fine-tuned version of Meta's Llama 2 models. The model is designed to excel particularly in reasoning.

7B 13B

54.5K Pulls 33 Tags Updated 11 months ago

stable-beluga

Llama 2 based model fine-tuned on an Orca-style dataset. Originally called Free Willy.

7B 13B

51.5K Pulls 49 Tags Updated 11 months ago

yi-coder

Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.

Code 1B 9B

49.6K Pulls 67 Tags Updated 6 weeks ago

hermes3

Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research

Tools 8B 70B 405B

48.7K Pulls 49 Tags Updated 8 weeks ago

internlm2

InternLM2.5 is a 7B parameter model tailored for practical scenarios with outstanding reasoning capability.

7B

48.1K Pulls 65 Tags Updated 3 months ago

dolphin-phi

2.7B uncensored Dolphin model by Eric Hartford, based on the Phi language model by Microsoft Research.

3B

46.8K Pulls 15 Tags Updated 10 months ago

llava-phi3

A new small LLaVA model fine-tuned from Phi 3 Mini.

Vision 3B

45.2K Pulls 4 Tags Updated 5 months ago

wizardlm-uncensored

Uncensored version of Wizard LM model

13B

44K Pulls 18 Tags Updated 14 months ago

yarn-mistral

An extension of Mistral to support context windows of 64K or 128K.

7B

39.8K Pulls 33 Tags Updated 10 months ago

llama-pro

An expansion of Llama 2 that specializes in integrating both general language understanding and domain-specific knowledge, particularly in programming and mathematics.

8B

39.3K Pulls 33 Tags Updated 9 months ago

medllama2

Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.

7B

37.2K Pulls 17 Tags Updated 12 months ago

meditron

Open-source medical large language model adapted from Llama 2 to the medical domain.

7B 70B

36.2K Pulls 22 Tags Updated 10 months ago

nexusraven

Nexus Raven is a 13B instruction tuned model for function calling tasks.

13B

35.9K Pulls 32 Tags Updated 10 months ago

nous-hermes2-mixtral

The Nous Hermes 2 model from Nous Research, now trained over Mixtral.

8x7B

33.6K Pulls 18 Tags Updated 9 months ago

llama3-groq-tool-use

A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.

Tools 8B 70B

32.5K Pulls 33 Tags Updated 3 months ago

codeup

Great code generation model based on Llama2.

Code 13B

32K Pulls 19 Tags Updated 11 months ago

everythinglm

Uncensored Llama2 based model with support for a 16K context window.

13B

30.1K Pulls 18 Tags Updated 10 months ago

minicpm-v

A series of multimodal LLMs (MLLMs) designed for vision-language understanding.

Vision 7B

30K Pulls 17 Tags Updated 6 weeks ago

mistral-small

Mistral Small is a lightweight model designed for cost-effective use in tasks like translation and summarization.

Tools 22B

28.4K Pulls 17 Tags Updated 5 weeks ago

magicoder

🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.

Code 7B

27.4K Pulls 18 Tags Updated 10 months ago

stablelm-zephyr

A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.

26.7K Pulls 17 Tags Updated 10 months ago

codebooga

A high-performing code instruct model created by merging two existing code models.

Code 34B

26.3K Pulls 16 Tags Updated 11 months ago

wizard-vicuna

Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.

13B

25.6K Pulls 17 Tags Updated 12 months ago

mistrallite

MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.

7B

24.9K Pulls 17 Tags Updated 11 months ago

falcon2

Falcon2 is an 11B-parameter causal decoder-only model built by TII and trained over 5T tokens.

11B

24.8K Pulls 17 Tags Updated 5 months ago

duckdb-nsql

7B parameter text-to-SQL model made by MotherDuck and Numbers Station.

Code 7B

23.4K Pulls 17 Tags Updated 9 months ago

nemotron-mini

A commercial-friendly small language model by NVIDIA optimized for roleplay, RAG QA, and function calling.

Tools

23.2K Pulls 17 Tags Updated 4 weeks ago

bge-m3

BGE-M3 is a new model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity.

Embedding

22.8K Pulls 3 Tags Updated 2 months ago

megadolphin

MegaDolphin-2.2-120b is a transformation of Dolphin-2.2-70b created by interleaving the model with itself.

21.9K Pulls 19 Tags Updated 9 months ago

notux

A top-performing mixture of experts model, fine-tuned with high-quality data.

8x7B

20.9K Pulls 18 Tags Updated 9 months ago

open-orca-platypus2

Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.

13B

20.3K Pulls 17 Tags Updated 14 months ago

goliath

A language model created by combining two fine-tuned Llama 2 70B models into one.

20.2K Pulls 16 Tags Updated 11 months ago

notus

A 7B chat model fine-tuned with high-quality data and based on Zephyr.

7B

20.2K Pulls 18 Tags Updated 9 months ago

mathstral

MathΣtral: a 7B model designed for math reasoning and scientific discovery by Mistral AI.

7B

18.8K Pulls 17 Tags Updated 3 months ago

solar-pro

Solar Pro Preview: an advanced large language model (LLM) with 22 billion parameters designed to fit into a single GPU

22B

17.2K Pulls 18 Tags Updated 4 weeks ago

dbrx

DBRX is an open, general-purpose LLM created by Databricks.

132B

15.7K Pulls 7 Tags Updated 6 months ago

nemotron

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM generated responses to user queries.

Tools 70B

15.3K Pulls 17 Tags Updated 6 days ago

nuextract

A 3.8B model fine-tuned on a private high-quality synthetic dataset for information extraction, based on Phi-3.

3B

14.9K Pulls 17 Tags Updated 2 months ago

reader-lm

A series of models that convert HTML content to Markdown content, which is useful for content conversion tasks.

0.5B 1.5B

13.8K Pulls 33 Tags Updated 5 weeks ago

firefunction-v2

An open weights function calling model based on Llama 3, competitive with GPT-4o function calling capabilities.

Tools 70B

12.4K Pulls 17 Tags Updated 3 months ago

alfred

A robust conversational model designed to be used for both chat and instruct use cases.

12.4K Pulls 7 Tags Updated 11 months ago

bge-large

Embedding model from BAAI mapping texts to vectors.

Embedding

8,489 Pulls 3 Tags Updated 2 months ago

deepseek-v2.5

An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.

Code 236B

7,203 Pulls 7 Tags Updated 6 weeks ago

bespoke-minicheck

A state-of-the-art fact-checking model developed by Bespoke Labs.

7B

7,110 Pulls 17 Tags Updated 4 weeks ago

paraphrase-multilingual

Sentence-transformers model that can be used for tasks like clustering or semantic search.

Embedding

5,157 Pulls 3 Tags Updated 2 months ago

granite3-dense

The IBM Granite 2B and 8B models are designed to support tool-based use cases and retrieval-augmented generation (RAG), streamlining code generation, translation, and bug fixing.

Tools

4,868 Pulls 33 Tags Updated yesterday

shieldgemma

ShieldGemma is a set of instruction-tuned models for evaluating the safety of text prompt input and text output responses against a set of defined safety policies.

2B 9B 27B

4,052 Pulls 49 Tags Updated 12 days ago

llama-guard3

Llama Guard 3 is a series of models fine-tuned for content safety classification of LLM inputs and responses.

1B 8B

3,651 Pulls 33 Tags Updated 12 days ago

granite3-moe

The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM designed for low latency usage.

Tools

2,854 Pulls 33 Tags Updated yesterday