Mixtral-8x22b

The Mixtral-8x22B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts.

The 4-bit quantized model fits on a single 80 GB A100.

Converted from https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1

Keep in mind that this is a foundation model, not an instruction-tuned one, so prompt it accordingly. Rather than asking "Write me a function in TypeScript that takes two numbers and multiplies them", give it a prefix to complete, something like:

/**
 * This function takes two numbers and multiplies them
 * @param arg1 number
 * @param arg2 number
 * @returns number
 */
export function

(Example taken from https://www.reddit.com/r/LocalLLaMA/comments/1c0tdsb/mixtral_8x22b_benchmarks_awesome_performance/)
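
To make that prompting style concrete, here is a minimal TypeScript sketch that sends the prefix above to a locally running Ollama server via its /api/generate endpoint. The model tag mixtral:8x22b is an assumption; substitute whatever tag you pulled this model under.

// Minimal sketch: send a completion-style prefix to a local Ollama server.
// Assumes Ollama is serving on its default port and that the model was
// pulled under the tag "mixtral:8x22b" (assumed tag; adjust to yours).
const prompt = `/**
 * This function takes two numbers and multiplies them
 * @param arg1 number
 * @param arg2 number
 * @returns number
 */
export function`;

const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "mixtral:8x22b",        // assumed tag
    prompt,                        // the prefix the base model will continue
    stream: false,                 // return the full completion at once
    options: { num_predict: 128 }, // cap the completion length
  }),
});

const { response: completion } = await res.json();
// A base model continues the prefix, e.g. with something like:
//   multiply(arg1: number, arg2: number): number { return arg1 * arg2; }
console.log(prompt + completion);

Because the model only continues text, the quality of the completion depends heavily on how much of the intended structure your prefix already establishes; the docblock here does that work for you.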