Details

Updated 5 days ago

5 days ago

f5e13a33e759 · 1.9GB ·

model

archqwen35

parameters2.27B

quantizationQ4_K_M

1.9GB

license

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US

11kB

system

You are a coding agent running inside Hermes. CORE RULES: - Minimize creativity. - Maximize determin

731B

params

{ "num_ctx": 128000, "presence_penalty": 1.5, "repeat_last_n": 2048, "repeat_penalty

136B

Hermes79 - 2B param, Q4, 128K ctx, Local/Offline, 8GB GPU

Custom Ollama model, fine-tuned from Qwen3.5-2B, configured for Hermes Agent and local personal-assistant workflows.

This model is based on a 2B parameter LLM, quantized in Q4, and configured with a very large context window for long assistant sessions. It is intended for local AI assistant experiments where privacy, offline operation, tool use, and extended conversations are important.

Model details

Type: Text/image model
Size: 2B parameters
Quantization: Q4
Context target: 128K
Real GPU memory usage: 6,5 GB VRAM
Recommended GPU memory: 8 GB VRAM
Main focus: Personal assistant and AI agent workflows
Tool use: Supported, depending on the client/application
Thinking/reasoning mode: Supported, depending on the client/application

Intended use

This model is designed for:

Hermes Agent
Local personal assistant workflows
Long-context conversations
Tool-assisted tasks
Linux/Windows help
Network and computer diagnostics
Educational AI agent demonstrations
Offline/private assistant experiments

Custom model for Hermes to use locally with 8gb GPUs (expect no miracles...)

Details

Readme

Hermes79 - 2B param, Q4, 128K ctx, Local/Offline, 8GB GPU

Model details

Intended use