3 Downloads · Updated yesterday

Kexity AI's first generation of flagship TLMs for efficient local inference.

Capabilities: tools · thinking
ollama run KexityAI/kex

Applications

  • Claude Code: ollama launch claude --model KexityAI/kex
  • OpenClaw: ollama launch openclaw --model KexityAI/kex
  • Hermes Agent: ollama launch hermes --model KexityAI/kex
  • Codex: ollama launch codex --model KexityAI/kex
  • OpenCode: ollama launch opencode --model KexityAI/kex

Models

1 model

kex:latest

397MB · 40K context window · Text · yesterday

Readme

NOTE: Kex has been succeeded by Kex 1.5. We suggest using that instead.

Kex is Kexity AI’s first generation of flagship TLMs for efficient local inference. It supports tool calling and thinking, with token-efficient reasoning designed for compute-constrained environments.
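As a sketch of what tool calling looks like against a locally served model, the snippet below builds the request payload that Ollama's /api/chat endpoint accepts for function-style tools. The `get_weather` tool is a made-up example for illustration; only the payload shape follows Ollama's API.

```python
import json

# Hypothetical tool definition; the schema shape follows Ollama's /api/chat
# "tools" field (OpenAI-style function declarations).
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# Request payload for one tool-calling chat turn with Kex.
payload = {
    "model": "KexityAI/kex",
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [get_weather_tool],
    "stream": False,
}

# POST this to http://localhost:11434/api/chat once the model is pulled;
# if the model decides to use the tool, the response's message carries
# a `tool_calls` list you execute and feed back as a "tool" role message.
print(json.dumps(payload, indent=2))
```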

Use Case

This model is intended for customers with extremely constrained compute or low-latency applications. Kex punches above its weight in agentic use cases and is well suited to tasks such as the following:

  • Agents running on edge/IoT devices with less than 512 MB of RAM
  • Low-latency chatbots and agents for environments where speed matters
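To make the low-latency chatbot use case concrete, here is a minimal single-turn chat-loop sketch. It targets Ollama's local /api/chat endpoint on the default port; the HTTP call is injected as a `transport` callable so the loop itself stays dependency-free and testable on-device. The function names (`ask`, `ollama_transport`) are illustrative, not part of any shipped API.

```python
import json
import urllib.request
from typing import Callable


def ollama_transport(payload: dict) -> dict:
    """POST a chat payload to a local Ollama server (assumes default port 11434)."""
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def ask(prompt: str, history: list, transport: Callable[[dict], dict]) -> str:
    """One chat turn: append the user prompt, send the history, return the reply."""
    history.append({"role": "user", "content": prompt})
    reply = transport(
        {"model": "KexityAI/kex", "messages": history, "stream": False}
    )["message"]
    history.append(reply)  # keep the assistant turn for the next round trip
    return reply["content"]
```

With an Ollama daemon running, `ask("hi", [], ollama_transport)` performs a single round trip; in tests or on constrained devices, a stub `transport` can stand in for the network.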