dcostenco/prism-ide:14b

31 Downloads Updated 1 month ago

Local-first AI tool router + coder. 4 sizes. 100% routing accuracy. 22/22 coding eval. 97% free. Beats Opus.

14b 32b

ollama run dcostenco/prism-ide:14b

curl http://localhost:11434/api/chat \
  -d '{
    "model": "dcostenco/prism-ide:14b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='dcostenco/prism-ide:14b',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'dcostenco/prism-ide:14b',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

Updated 1 month ago

705df6d30842 · 9.0GB

model: arch qwen3 · parameters 14.8B · quantization Q4_K_M (9.0GB)

params (135B): { "num_ctx": 16384, "num_predict": 512, "stop": [ "<|im_start|>", "<|im_

template (156B): {{ if .System }}<|im_start|>system {{ .System }}<|im_end|> {{ end }}{{ if .Prompt }}<|im_start|>user
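The `num_ctx` and `num_predict` values listed under params are defaults baked into the model, and Ollama's `/api/chat` endpoint accepts per-request overrides in an `options` object. The following is a minimal sketch of that, using only the Python standard library. The helper names `build_chat_request` and `send_chat` are made up for this example, and `send_chat` assumes an Ollama server is listening on the default local port.

```python
import json
import urllib.request

# Default local endpoint, the same one used by the curl example on this page.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt, num_ctx=16384, num_predict=512):
    """Build an /api/chat payload that sets context size and output length."""
    return {
        "model": "dcostenco/prism-ide:14b",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
        "options": {"num_ctx": num_ctx, "num_predict": num_predict},
    }


def send_chat(payload, url=OLLAMA_CHAT_URL):
    """POST the payload to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]


if __name__ == "__main__":
    # Allow a longer completion than the 512-token default for this one call.
    payload = build_chat_request(
        "Write a UUID validator in TypeScript.", num_predict=1024
    )
    print(json.dumps(payload, indent=2))
```

Raising `num_predict` is the likely first adjustment for code generation, since a 512-token cap can cut off longer files mid-function.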

Readme


# Prism IDE — Local-First AI Coding Assistant

Fine-tuned on Qwen3 for healthcare-grade TypeScript codebases. Matches Claude Sonnet 4.6 on a 22-check coding eval while running fully offline.

## Models

| Tag | Base | Size | Best for |
|-----|------|------|----------|
| `1.7b` | Qwen3-1.7B | 1.1 GB | On-device / iOS |
| `8b` | Qwen3-8B | 5.2 GB | Laptop / fast iteration |
| `14b` | Qwen3-14B | 9.0 GB | Daily driver |
| `32b` | Qwen3-32B | 20 GB | Highest quality |

## Performance

| Metric | Score |
|--------|-------|
| Routing accuracy (BFCL) | 100% |
| Coding eval (22 checks) | 22/22 |
| vs Claude Sonnet 4.6 | Tied |
| vs Claude Opus 4 | Beats on coding |

## Quick Start

```bash
ollama run dcostenco/prism-ide:14b
```

## What it knows

- TypeScript / Next.js App Router patterns
- Healthcare audit logging (`withAudit`, HIPAA non-blocking `.then`)
- Supabase RLS, UUID validation, JSONB safety
- General ledger (CO/PR entries, CAS patterns)
- Tool-call routing (picks the right model/tool for the task)

## Thinking suppression

Qwen3 thinking is disabled by default in the Modelfile for deterministic, fast responses. Ideal for IDE autocomplete and agent pipelines.
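Qwen3 models wrap their reasoning in `<think>` tags, and a reply can still carry an empty or leftover block if the template or prompt is overridden downstream. As a defensive measure for autocomplete and agent pipelines, a client could strip such blocks before using the text. This is a small sketch of that idea. The function name `strip_thinking` is an assumption of this example and is not part of the model or of Ollama.

```python
import re

# Matches a complete <think>...</think> block, including multi-line content,
# plus any whitespace that follows it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(text):
    """Remove <think>...</think> blocks and trim surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()


if __name__ == "__main__":
    raw = "<think>\n\n</think>\n\nconst id = crypto.randomUUID();"
    print(strip_thinking(raw))
```

Text without any `<think>` block passes through unchanged apart from trimmed leading and trailing whitespace.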

## Companion routing model

Use `dcostenco/prism-coder` for pure tool/model routing (smaller, faster). Use `prism-ide` when you need code generation quality.
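To make the routing idea concrete, here is a hedged sketch of the client side: tool schemas in the function-calling shape that Ollama's `/api/chat` accepts in its `tools` field, and a dispatcher that runs whatever `tool_calls` come back. The tools `read_file` and `run_tests` are hypothetical and do not come from this model card. The sketch also assumes the model returns structured `tool_calls` in its reply, which this page does not confirm.

```python
import json


# Hypothetical tools for illustration; neither name comes from the model card.
def read_file(path):
    return f"(contents of {path})"


def run_tests(target):
    return f"(test results for {target})"


TOOL_REGISTRY = {"read_file": read_file, "run_tests": run_tests}


def _schema(name, description, param):
    """Build one function-calling schema with a single required string parameter."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {param: {"type": "string"}},
                "required": [param],
            },
        },
    }


# This list would be sent as the "tools" field of an /api/chat request.
TOOL_SCHEMAS = [
    _schema("read_file", "Read a file from the workspace", "path"),
    _schema("run_tests", "Run the test suite for a target", "target"),
]


def dispatch_tool_calls(message, registry=TOOL_REGISTRY):
    """Run each tool call in an assistant message; return (name, result) pairs."""
    results = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        name = function["name"]
        arguments = function.get("arguments") or {}
        if isinstance(arguments, str):  # some clients deliver a JSON string
            arguments = json.loads(arguments)
        if name not in registry:
            results.append((name, f"unknown tool: {name}"))
            continue
        results.append((name, registry[name](**arguments)))
    return results


if __name__ == "__main__":
    # A hand-written assistant message in the shape of an /api/chat reply.
    sample = {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "read_file",
                          "arguments": {"path": "src/app/page.tsx"}}}
        ],
    }
    print(dispatch_tool_calls(sample))
```

Unknown tool names are reported rather than raised, so one bad routing decision does not abort an agent loop.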

## License

MIT. Weights are derived from Qwen3 (Apache 2.0).
