GGUF quantizations of https://huggingface.co/jbomdev/AlterEgo, a 373M-parameter decoder-only model built from the ground up: architecture, training, tokenizer, and inference all written from scratch. For the full story, see the Hugging Face page.

ollama run jbomdev/alterego:Q8_0

curl http://localhost:11434/api/chat \
  -d '{
    "model": "jbomdev/alterego:Q8_0",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
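
By default, `/api/chat` streams its reply as newline-delimited JSON, one object per chunk, so the `curl` call above prints many small fragments rather than one message. A minimal sketch of stitching those chunks back together, assuming each line carries a `message.content` fragment and a `done` flag as in Ollama's documented streaming format; `assemble_stream` and the sample lines are illustrative and not part of the model or of Ollama:

```python
import json
from typing import Iterable


def assemble_stream(lines: Iterable[str]) -> str:
    """Join the content fragments of a streamed /api/chat response."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk carries done=true
    return "".join(parts)


# Hand-written sample chunks (hypothetical output, for illustration only).
sample = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"message": {"role": "assistant", "content": "lo!"}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]
print(assemble_stream(sample))
```

Adding `"stream": false` to the request body asks the server for a single JSON object instead, which avoids this step.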

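# Requires the ollama Python package and a running Ollama server.
from ollama import chat

response = chat(
    model='jbomdev/alterego:Q8_0',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
# The reply text lives on the returned message object.
print(response.message.content)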

// Requires the ollama npm package; top-level await needs an ES module.
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'jbomdev/alterego:Q8_0',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

Updated 13 hours ago

5e50c9a34166 · 403MB

model

arch llama · parameters 373M · quantization Q8_0

403MB
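
The 403MB figure is close to what the quantization format predicts. A rough check, assuming the standard GGUF Q8_0 layout (blocks of 32 weights stored as 32 int8 values plus one 16-bit scale, which works out to 8.5 bits per weight) and ignoring metadata and any tensors kept at higher precision; `estimate_size_mb` is a hypothetical helper:

```python
def estimate_size_mb(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in MB (10**6 bytes) for a given bit width."""
    return n_params * bits_per_weight / 8 / 1e6


N_PARAMS = 373e6  # parameter count from the model details

# Q8_0: 32 int8 weights + one 16-bit scale per block -> (32*8 + 16) / 32 bits
Q8_0_BPW = (32 * 8 + 16) / 32

print(f"Q8_0 ~ {estimate_size_mb(N_PARAMS, Q8_0_BPW):.0f} MB")
print(f"F16  ~ {estimate_size_mb(N_PARAMS, 16):.0f} MB")
```

The Q8_0 estimate lands slightly under the listed size. The gap is plausibly metadata and tensors that are not quantized, though the page does not say.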

template

{{ if .System }}<|im_start|>system {{ .System }}<|im_end|> {{ end }}{{ if .Prompt }}<|im_start|>user

182B
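
The template is shown truncated, but its visible part follows the ChatML convention of wrapping each turn in `<|im_start|>` and `<|im_end|>` markers. A minimal sketch of how such a prompt might be assembled, assuming the hidden remainder closes the user turn and opens an assistant turn in the usual ChatML way (an assumption, not the model's confirmed template); `render_chatml` is a hypothetical helper and the system text is a shortened stand-in:

```python
def render_chatml(user: str, system: str = "") -> str:
    """Build a ChatML-style prompt; the exact whitespace is assumed."""
    parts = []
    if system:  # mirrors `{{ if .System }}` in the template
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    if user:  # mirrors `{{ if .Prompt }}`
        parts.append(f"<|im_start|>user\n{user}<|im_end|>\n")
    # Assumed: generation starts from an open assistant turn.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chatml("Hello!", system="You are Alter Ego."))
```

Ollama applies the real template itself, so a helper like this only matters when sending raw prompts to the model.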

system

You are Alter Ego, a small AI built from scratch. You're casual and direct. You're not great with fa

228B

params

{ "num_ctx": 2048, "repeat_penalty": 1.1, "stop": [ "<|im_end|>", "<|end

135B
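
These defaults can be overridden per request through the `options` field of `/api/chat`. A minimal sketch using only the standard library, assuming a local server on the default port from the `curl` example; `build_chat_request` and `send_chat` are hypothetical helpers, and only the two fully visible parameters are set because the `stop` list is truncated above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt: str, **options) -> dict:
    """Return an /api/chat payload with per-request option overrides."""
    return {
        "model": "jbomdev/alterego:Q8_0",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of chunks
        "options": options,
    }


def send_chat(payload: dict) -> str:
    """POST the payload to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]


payload = build_chat_request("Hello!", num_ctx=2048, repeat_penalty=1.1)
print(json.dumps(payload, indent=2))
# send_chat(payload) needs a running server, so it is not called here.
```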

Readme

🧠 AlterEgo-373M - GGUF

GGUF builds of a 373M language model designed, trained, and served entirely from scratch.

🤗 Original Model | GitHub (Training) | GitHub (Platform)

Run it with Ollama

To run a specific quantization, append its tag to the model name:

Recommended Q8_0 version

ollama run jbomdev/alterego:q8_0

Small Q4_K_M version

ollama run jbomdev/alterego:q4_k_m

Highest-precision F16 version (16-bit floats)

ollama run jbomdev/alterego:f16
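
When scripting against several builds, the three tags above can be kept in one place. A small sketch, assuming the `ollama` CLI is on `PATH`; the `TAGS` mapping and `run_command` are hypothetical names chosen for illustration:

```python
import shlex

# Tag names as listed in this README.
TAGS = {
    "recommended": "q8_0",
    "small": "q4_k_m",
    "f16": "f16",
}


def run_command(choice: str, prompt: str = "") -> list[str]:
    """Build the argv for `ollama run` with the chosen quantization tag."""
    cmd = ["ollama", "run", f"jbomdev/alterego:{TAGS[choice]}"]
    if prompt:
        cmd.append(prompt)  # one-shot prompt instead of an interactive session
    return cmd


print(shlex.join(run_command("small", "Hello!")))
```

Passing the returned list to `subprocess.run` would execute it; printing it first makes it easy to check which build a script is about to pull.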