A small language model that simulates Quiet-STaR-like behaviour in a simplified, agentic way.

ollama run trollek/thoughtstream:4b-v01-q6_K

curl http://localhost:11434/api/chat \
  -d '{
    "model": "trollek/thoughtstream:4b-v01-q6_K",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='trollek/thoughtstream:4b-v01-q6_K',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'trollek/thoughtstream:4b-v01-q6_K',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

Updated 2 years ago

1b7145223393 · 3.3GB

model · 3.3GB

arch: llama

parameters: 3.96B

quantization: Q6_K

params · 135B

{ "num_ctx": 8192, "num_predict": 2048, "stop": [ "<|im_end|>", "<|im_st
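The parameters above are the model's defaults, and Ollama's REST API also accepts an `options` object on each request. The sketch below builds a `/api/chat` request body that sets the same values explicitly. `build_chat_payload` is a hypothetical helper name, and because the stop list is truncated in the listing, only its one complete entry is reproduced.

```python
import json

# Values copied from the visible part of the params layer. The stop list is
# cut off in the listing, so only the complete "<|im_end|>" entry is used.
OPTIONS = {
    "num_ctx": 8192,       # context window, in tokens
    "num_predict": 2048,   # maximum number of tokens to generate
    "stop": ["<|im_end|>"],
}


def build_chat_payload(prompt: str) -> dict:
    """Build a JSON body for POST /api/chat (hypothetical helper)."""
    return {
        "model": "trollek/thoughtstream:4b-v01-q6_K",
        "messages": [{"role": "user", "content": prompt}],
        "options": OPTIONS,
        "stream": False,  # ask for a single JSON response instead of a stream
    }


if __name__ == "__main__":
    print(json.dumps(build_chat_payload("Hello!"), indent=2))
```

The printed JSON can be passed to the curl command shown earlier via `-d`. Changing a value in `OPTIONS` overrides the corresponding default for that request only.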

template · 63B

<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant

Readme

ThoughtStream-4B-v0.1

This model is based on h2oai/h2o-danube3-4b-base and fine-tuned using LoRA+ and BAdam with LLaMA-Factory. It uses the ChatML template, without a system message, and was trained on the ThoughtfulAssistant-v01 dataset.
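To illustrate what "ChatML without a system message" looks like as a raw prompt, here is a minimal sketch. `format_chatml` is a hypothetical helper, not part of any library. The newline placement and the multi-turn layout follow the usual ChatML convention and are assumptions here, since the page's template only covers a single user turn.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages):
    """Render user/assistant turns as a ChatML prompt with no system message.

    The returned string ends with an open assistant turn so the model
    continues from there. Hypothetical helper, assuming standard ChatML.
    """
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("user", "assistant"):
            # The model was trained without a system message, so reject it.
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"{IM_START}{role}\n{message['content']}{IM_END}\n")
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(format_chatml([{"role": "user", "content": "Hello!"}]))
```

When the model is called through Ollama's chat endpoint, the server applies the template itself. A helper like this is only relevant when sending raw prompts, for example with the GGUF quants in another runtime.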

The idea is to abstract the model's thoughts away, or move them into a thought bubble, when chatting.

HF repo: trollek/ThoughtStream-4B-v0.1

Quants: mradermacher/ThoughtStream-4B-v0.1-GGUF