vatistasdim/Cipher

Cipher is a compact conversational AI assistant optimized for direct responses, practical reasoning, structured output, and efficient local deployment. Designed for fast inference on consumer hardware, Cipher focuses on usability, consistency,

Details

Updated 1 week ago

c71a0954f1d8 · 2.0GB

model: arch llama · parameters 3.21B · quantization Q4_K_M · 2.0GB

template (1.4kB): <|start_header_id|>system<|end_header_id|> Cutting Knowledge Date: December 2023 {{ if .System }}{{

system (628B): Identity and specs: Model name: Cipher. Creator statement: Dimitris Vatistas made and trained you wi

params (164B): { "num_ctx": 2048, "repeat_penalty": 1.05, "stop": [ "<|start_header_id|>",

Behavior Profile

Cipher is the stricter model in the Cipher pair (Cipher and Cipher-Abliterated). It is meant to stay close to the prompt, keep wording controlled, and avoid drifting into unnecessary alternatives. Its lower temperature makes it the better choice of the two for:

debugging steps
command suggestions
code review notes
structured summaries
short implementation plans
deterministic agent-style responses
formatted answers that should not change much between runs
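
For the run-to-run stability described above, a client can pin the sampling options explicitly. The sketch below is one way to do that, not the model's shipped configuration: the helper name `build_repeatable_request` and the seed value are illustrative, the `temperature` of 0.2 is borrowed from the benchmark options on this page, and `seed` is a standard Ollama option. It only builds the request body for `/api/chat`; sending it is left to whatever client you use.

```python
import json


def build_repeatable_request(prompt, temperature=0.2, seed=42, num_ctx=2048):
    """Build an Ollama /api/chat body with pinned sampling options."""
    return {
        "model": "vatistasdim/Cipher",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": {
            "temperature": temperature,  # low value for controlled wording
            "seed": seed,                # assumption: any fixed integer will do
            "num_ctx": num_ctx,          # same value as the model's params preview
        },
    }


if __name__ == "__main__":
    body = build_repeatable_request("List three steps to debug a failing test.")
    print(json.dumps(body, indent=2))
```

A fixed seed with a low temperature should make repeated runs much more similar, though identical output across machines or Ollama versions is not guaranteed.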

Benchmark Profile

Benchmark results depend on hardware, prompt size, context length, and Ollama settings. Cipher aims for a practical balance: more capable than very small local models, while staying much lighter than 7B, 8B, or cloud-scale models.

Area	Cipher Profile	What This Means
Local speed	High for a 3.2B-class model	Good for chat, CLI use, and repeated local calls.
Memory use	Low to moderate	Designed to run on consumer machines without a large GPU requirement.
Answer precision	High	Lower temperature helps with direct answers, code explanations, and checklists.
Creativity	Moderate	Better for controlled output than broad brainstorming.
Long-context work	Strong when context is increased	Start at a `num_ctx` of `2048` tokens, then raise it for large files or logs.
Structured output	Strong	Good fit for Markdown, JSON-shaped output, plans, and automation steps.
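
The long-context row suggests starting at 2048 tokens and raising the context for bigger inputs. A rough way to pick a value is sketched below. Everything in it is an assumption for illustration: the helper name `suggest_num_ctx`, the four-characters-per-token estimate (a crude rule of thumb, not Cipher's real tokenizer), the reply budget, and the 32768 cap.

```python
def suggest_num_ctx(text, reply_tokens=512, chars_per_token=4, floor=2048, cap=32768):
    """Pick a num_ctx by doubling from `floor` until the estimate fits."""
    # Crude estimate only; the real token count depends on the tokenizer.
    estimated_prompt_tokens = len(text) // chars_per_token + 1
    needed = estimated_prompt_tokens + reply_tokens
    num_ctx = floor
    while num_ctx < needed and num_ctx < cap:
        num_ctx *= 2
    return num_ctx


if __name__ == "__main__":
    # Hypothetical log content standing in for a large file.
    log_text = "ERROR timeout while connecting\n" * 700
    print(suggest_num_ctx(log_text))
```

Pass the result as `num_ctx` in the request options. Larger values use more memory, so raise it only as far as the input needs.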

Local Benchmark Snapshot

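These numbers come from single local smoke-test runs on the same machine with one short prompt. They are useful for relative runtime feel, not as universal benchmark claims, and token speed implies nothing about answer quality. Total time is longer than eval tokens divided by generation speed because it covers the whole request, not only token generation.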

Benchmark prompt: Write exactly six concise bullets comparing local AI assistants for coding, summarization, and brainstorming.

Benchmark options: num_ctx 2048, num_predict 140, temperature 0.2.

Model	Installed size	Eval tokens	Total time	Generation speed
`vatistasdim/Cipher:latest`	2.0 GB	137	12.60 s	32.19 tok/s
`vatistasdim/Cipher-Abliterated:latest`	2.0 GB	140	4.32 s	38.46 tok/s
`hf.co/bartowski/Qwen2.5-3B-Instruct-GGUF:Q4_K_M`	1.9 GB	78	10.31 s	34.79 tok/s
`phi3:mini`	2.2 GB	140	8.27 s	33.17 tok/s
`gemma:2b`	1.7 GB	140	5.55 s	46.54 tok/s
`dolphin-phi:latest`	1.6 GB	140	8.81 s	38.00 tok/s
`huihui_ai/falcon3-abliterated:3b`	2.0 GB	140	12.53 s	36.29 tok/s
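
A generation speed like the ones in this table can be derived from the timing fields Ollama returns with a non-streamed response: `eval_count` (generated tokens) and `eval_duration` (nanoseconds). The sketch below assumes those fields are where the numbers came from, which the page does not state; the helper name and the sample values are made up for illustration.

```python
def generation_speed(response):
    """Return generated tokens per second from an Ollama response dict."""
    eval_seconds = response["eval_duration"] / 1e9  # nanoseconds to seconds
    return response["eval_count"] / eval_seconds


if __name__ == "__main__":
    # Hypothetical timing values, not taken from the table above.
    sample = {"eval_count": 140, "eval_duration": 3_500_000_000}
    print(f"{generation_speed(sample):.2f} tok/s")  # prints "40.00 tok/s"
```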

Near-2GB Model Comparison

Model	Size Class	Main Feel	Cipher Difference
`gemma:2b`	1.7 GB local model	Fast, lightweight general chat	Cipher is tuned more for structured technical answers and precision.
`phi3:mini`	2.2 GB local model	Compact reasoning and instruction following	Cipher uses a more controlled sampling profile for concise local workflows.
`dolphin-phi:latest`	1.6 GB local model	Lightweight conversational assistant	Cipher is more focused on predictable coding, planning, and checklist output.
`hf.co/bartowski/Qwen2.5-3B-Instruct-GGUF:Q4_K_M`	1.9 GB local model	General instruction model with broad use	Cipher has a narrower practical assistant identity and lower-temperature output.
`huihui_ai/falcon3-abliterated:3b`	2.0 GB local model	Flexible 3B-class generation	Cipher is tuned more tightly, with a stronger emphasis on precision.
`vatistasdim/Cipher-Abliterated:latest`	2.0 GB Cipher variant	More adaptive and creative	Cipher is the stricter option for direct technical work.

Request Flow

sequenceDiagram
    participant User
    participant Client as "CLI, app, or script"
    participant Ollama as "Local Ollama runtime"
    participant Cipher as "Cipher model"

    User->>Client: "Send prompt"
    Client->>Ollama: "Chat request with model vatistasdim/Cipher"
    Ollama->>Cipher: "Apply model settings and context"
    Cipher-->>Ollama: "Generated response"
    Ollama-->>Client: "Response payload or streamed tokens"
    Client-->>User: "Precise, structured answer"
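
The last steps of the diagram cover the case where the client receives streamed tokens. With `"stream": true`, Ollama's chat endpoint sends newline-delimited JSON objects, each carrying a piece of `message.content`, with `done` set to true on the final one. A minimal sketch of the client-side assembly follows; the helper name `collect_stream` and the sample chunks are hypothetical.

```python
import json


def collect_stream(lines):
    """Join the content pieces from streamed /api/chat chunks."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return "".join(parts)


if __name__ == "__main__":
    # Hypothetical chunks shaped like a streamed chat response.
    fake_stream = [
        '{"message": {"role": "assistant", "content": "Run "}, "done": false}',
        '{"message": {"role": "assistant", "content": "the tests."}, "done": false}',
        '{"message": {"role": "assistant", "content": ""}, "done": true}',
    ]
    print(collect_stream(fake_stream))
```

In a real client, `lines` would be the response body read line by line; a CLI would typically print each piece as it arrives instead of joining them at the end.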

Strengths

Fast local inference through Ollama.
Clear, structured, task-oriented responses.
Useful behavior for coding, debugging, automation, and technical explanation.
Stable assistant style for agent workflows and repeated local use.

Local API Usage

Start the Ollama service, then call the chat API:

curl http://localhost:11434/api/chat \
  -d '{
    "model": "vatistasdim/Cipher",
    "messages": [
      { "role": "user", "content": "Write a concise plan for testing a CLI tool." }
    ],
    "stream": false,
    "options": {
      "temperature": 0.55,
      "top_p": 0.9,
      "repeat_penalty": 1.05,
      "num_ctx": 2048
    }
  }'

Python:

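# Requires the `ollama` Python package (pip install ollama) and a running Ollama service.
from ollama import chat

# Send a single-turn chat request to the local model.
response = chat(
    model="vatistasdim/Cipher",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print only the assistant's reply text.
print(response.message.content)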

JavaScript:

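// Requires the `ollama` npm package; top-level await needs an ES module context.
import ollama from "ollama";

// Send a single-turn chat request to the local model.
const response = await ollama.chat({
  model: "vatistasdim/Cipher",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print only the assistant's reply text.
console.log(response.message.content);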

Application Launch Examples

ollama launch claude --model vatistasdim/Cipher
ollama launch codex-app --model vatistasdim/Cipher
ollama launch openclaw --model vatistasdim/Cipher
ollama launch codex --model vatistasdim/Cipher
ollama launch opencode --model vatistasdim/Cipher

Best Fit

Use Cipher when you want a precise local assistant for:

Coding and debugging help
Command-line workflows
Local automation planning
Structured summaries
Technical checklists
Concise explanations
Agent-style tasks that need predictable formatting
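
For agent-style tasks that need predictable formatting, one option is to ask for JSON and validate the reply before acting on it. The sketch below uses the `format` field of Ollama's chat request set to `"json"`; the prompt wording, the `steps` key, and the helper names are assumptions for illustration. A small model can still return JSON that misses the requested shape, so the parser checks it.

```python
import json


def build_json_request(task):
    """Build an /api/chat body that asks Cipher for a JSON object."""
    return {
        "model": "vatistasdim/Cipher",
        "messages": [{
            "role": "user",
            "content": f'{task} Reply as JSON with a "steps" array of strings.',
        }],
        "format": "json",  # ask Ollama to constrain the reply to valid JSON
        "stream": False,
    }


def parse_steps(reply_text):
    """Return the list of steps, or raise ValueError if the shape is wrong."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError("reply is not valid JSON") from exc
    steps = data.get("steps") if isinstance(data, dict) else None
    if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
        raise ValueError('reply has no "steps" list of strings')
    return steps


if __name__ == "__main__":
    # Hypothetical reply text standing in for the model's message content.
    print(parse_steps('{"steps": ["Install the tool", "Run the smoke test"]}'))
```

Rejecting malformed replies early keeps a downstream script from acting on a partial or misshapen plan.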

For more open-ended brainstorming, use Cipher-Abliterated. For tighter answers, repeatable formatting, and practical technical work, use Cipher.