rubinmaximilian/Monk-Router-phi4mini

2 Downloads Updated 20 hours ago

A specialized logic-router for the Monk AI ecosystem, built on the Phi-4 Mini (3.8B) reasoning engine. Optimized for high-fidelity JSON tool-calling and hardware-aware task routing between Jetson edge devices and high-VRAM GPU servers.

tools

```bash
ollama run rubinmaximilian/Monk-Router-phi4mini
```

```bash
curl http://localhost:11434/api/chat \
  -d '{
    "model": "rubinmaximilian/Monk-Router-phi4mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

```python
from ollama import chat

response = chat(
    model='rubinmaximilian/Monk-Router-phi4mini',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)
```

```javascript
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'rubinmaximilian/Monk-Router-phi4mini',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
```

Applications

- **Claude Code:** `ollama launch claude --model rubinmaximilian/Monk-Router-phi4mini`
- **Codex:** `ollama launch codex --model rubinmaximilian/Monk-Router-phi4mini`
- **OpenCode:** `ollama launch opencode --model rubinmaximilian/Monk-Router-phi4mini`
- **OpenClaw:** `ollama launch openclaw --model rubinmaximilian/Monk-Router-phi4mini`

Models

1 model

| Name | Size | Context | Input |
| --- | --- | --- | --- |
| Monk-Router-phi4mini:latest | 2.5GB | 128K | Text |

Readme

### `Monk-Router-phi4mini`

**Monk-Router-phi4mini** is a specialized logic gateway for the **Monk AI** ecosystem. It acts as the "Prefrontal Cortex" of the system, analyzing user requests to determine the most efficient hardware and model path.

Built on Microsoft's **Phi-4 Mini (3.8B)**, this model is specifically tuned for technical reasoning, complex code analysis, and strict JSON output.

For a similar router with faster, lower-latency responses, see [my other model, Monk-Router-Gemma4e2b](https://ollama.com/rubinmaximilian/Monk-Router-Gemma4e2b), and let me know if I should build an even larger model for scaled applications!

### Key Features
- **Hardware-Aware:** Intelligently routes tasks based on local Jetson Orin Nano constraints.
- **Precision Logic:** High-fidelity decision making for complex tasks like security audits and large-file analysis.
- **Strict JSON:** Guaranteed tool-call output for seamless integration with Python/C++ backends.
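
A backend will usually want to check the router's reply before acting on it. The following is a minimal sketch of such a check, not part of the model itself: `parse_router_reply` is a hypothetical helper, and the reply shape (a single JSON object with a `tool` key) is an assumption, since this README does not publish a schema.

```python
import json
from typing import Any, Optional


def parse_router_reply(text: str) -> Optional[dict[str, Any]]:
    """Return the reply as a dict, or None if it is not a usable JSON object.

    Assumption: the router replies with one JSON object containing a
    "tool" key. The actual schema is not documented in this README.
    """
    try:
        data = json.loads(text.strip())
    except json.JSONDecodeError:
        # Plain-text or truncated replies end up here.
        return None
    if not isinstance(data, dict) or "tool" not in data:
        return None
    return data


print(parse_router_reply('{"tool": "switch_model", "arguments": {"model": "phi"}}'))
print(parse_router_reply("Sorry, I cannot help with that."))
```

Returning `None` instead of raising lets the caller decide whether to retry the prompt or fall back to a default route.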

### Tools & Routing Logic
The model evaluates incoming prompts and outputs a JSON command to:
1. `switch_model`: Swap local models (Gemma, Phi, Qwen).
2. `set_server`: Offload tasks to a **Main PC GPU** or **Cloud API**.
3. `activate_swarm`: Trigger multi-model agents (Research Squad, Code Review).
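
The three commands above could be wired to backend code with a small dispatch table. This is only a sketch under stated assumptions: the tool names come from the list, but the command layout (`tool` plus `arguments`), the argument names (`model`, `target`, `squad`), and the handler bodies are hypothetical placeholders.

```python
from typing import Any, Callable


# Hypothetical handlers. Real ones would swap models, change the server
# endpoint, or start agents; here they only describe the action taken.
def switch_model(model: str) -> str:
    return f"switching local model to {model}"


def set_server(target: str) -> str:
    return f"offloading to {target}"


def activate_swarm(squad: str) -> str:
    return f"activating swarm: {squad}"


HANDLERS: dict[str, Callable[..., str]] = {
    "switch_model": switch_model,
    "set_server": set_server,
    "activate_swarm": activate_swarm,
}


def dispatch(command: dict[str, Any]) -> str:
    """Call the handler named by command["tool"] with command["arguments"]."""
    name = command.get("tool")
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"unknown tool: {name!r}")
    return handler(**command.get("arguments", {}))


print(dispatch({"tool": "set_server", "arguments": {"target": "main_pc_gpu"}}))
```

Keeping the handlers in a dictionary means an unknown tool name fails loudly rather than being silently ignored.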

### Usage example
```bash
ollama run rubinmaximilian/Monk-Router-phi4mini "Analyze this 500-line C++ file for memory leaks."
```
