gurubot/ cage:600m-Q4_K_M

A faq/chatbot model that can't hallucinate

ollama run gurubot/cage:600m-Q4_K_M

Details

Base model: qwen3 · 751M parameters · Q4_K_M quantization · 484MB
CAGE: Constrained Answer Generation Engine

The Problem with AI Chatbots

Every AI chatbot can hallucinate: it can confidently give incorrect information, and it has the entire scope of the English language in which to do so. You can try to prompt-engineer around it. Add guardrails. Use RAG. But the model can still generate anything, and when the output is free-form text it is impossible to guarantee what it will say.

For customer support this is dangerous: the bot can invent non-existent policies and procedures, or state facts that simply aren’t true. That is more than an embarrassing mistake, because companies can be held liable for the promises their chatbots make.

These aren’t hypothetical scenarios. Companies have faced lawsuits, payouts, and reputational damage because their chatbots hallucinated. Here are some cases:

Air Canada: After his grandmother died, a customer asked Air Canada’s chatbot about a bereavement discount on his fare, and the bot told him he could claim a refund within 90 days. This was false; Air Canada’s actual policy doesn’t allow this. When he tried to claim the refund as the chatbot had instructed, Air Canada refused and the matter went to a tribunal, where Air Canada argued that the chatbot was “a separate legal entity responsible for its own actions.” The tribunal disagreed, ruled against the airline, and ordered it to pay damages and tribunal fees.

The tribunal’s written response stated: “It makes no difference whether the information comes from a static page or a chatbot.”

New York’s “MyCity” chatbot: Launched to give small businesses advice, it handed out dangerously wrong legal guidance, telling shop owners they could refuse cash payments (which violated the law) and telling landlords they could reject tenants using housing vouchers (also illegal). Businesses following the bot’s advice would have been breaking the law.

Chevrolet dealer: A chatbot agreed to sell a $75,000 Chevy Tahoe for $1 after being manipulated via prompt injection.

DPD: DPD’s chatbot was easily persuaded to start swearing at customers and criticising DPD.

As these cases show, “the AI made a mistake” may not be a valid defense: the company may be legally liable, or at least suffer brand damage from a wayward AI.

The CAGE Solution

CAGE removes the ability to hallucinate.

Instead of generating free-form responses, CAGE selects from a predetermined list of approved answers. The model outputs a placeholder token like {questionCardLockReason} and your code substitutes it with the full response.

The LLM’s job becomes simple: understand what the customer needs, then pick the right response from the menu. This lets it leverage its natural-language understanding and intelligence while reining in its output.

example interaction:

Customer: "I forgot my password"
Model outputs: {answerResetPassword}
code sends as: "You can reset your password from the sign-in screen by tapping 'Forgot password.'"

The model can’t invent new facts. It can’t go off-script. It can only choose from the approved responses you gave it.

Benefits

Removed Hallucination Risk

The model cannot output information you haven’t pre-approved.

Easy Localization

Since the model works with placeholders you can easily have multiple mapping files, one for each language.
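A minimal sketch of that idea (the table contents and language codes here are illustrative, not shipped with the model): keep one table per language and pick it at substitution time, so the model’s placeholder output never changes.

```python
import re

# Sketch: one response table per language (contents are illustrative).
# The model always emits the same placeholders; only the lookup table changes.
response_tables = {
    "en": {"{greeting}": "Welcome to BrightBank Support! How can I help today?"},
    "es": {"{greeting}": "¡Bienvenido al soporte de BrightBank! ¿En qué puedo ayudarle?"},
}

def expand(placeholder_string, lang="en"):
    """Substitute placeholders using the table for the requested language."""
    table = response_tables[lang]
    return "".join(table.get(p, p) for p in re.findall(r"\{.*?\}", placeholder_string))
```

Since localization happens entirely on your side, translations can be reviewed and updated without retraining or re-prompting the model.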

Consistent response style

Your responses are written by your team. The tone won’t drift or vary unpredictably. Your chatbot won’t tell people it is “chatGPT”.

Accurate responses

Responses that must be accurate, such as website URLs and phone numbers, are guaranteed to be correct, rather than the LLM outputting a plausible-looking URL that returns a 404 - Page not found when the user clicks it.

Prompt Injection resilient

The model can still only output the responses you’ve preconfigured.

Tiny model

Because this model is specialised for the task, a much smaller model suffices, which allows deployment on systems with limited VRAM or no GPU while still being performant.

How It Works

In the system prompt add the mapping from each placeholder to the desired response, for example:

{greeting} : Welcome to BrightBank Support! How can I help today?
{answerCardLocked} : It looks like your card is currently locked. You can unlock it instantly in the app under Card Settings.
{questionTransactionDetails} : Which transaction are you referring to (merchant name, amount, and date)?
{statementICantHelpWithThat} : Sorry I can't help with that, please contact our support line at 555 686 8686
{urlBrightBankLogin} : https://app.brightbank.example/login

Always include a {statementICantHelpWithThat} response option as a fallback for when the model cannot understand the customer’s query.

The model is trained on placeholders prefixed with {greeting}, {answer}, {question}, {statement}, and {url}. You can use any format you like, but sticking to these patterns will probably perform better.
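As a sanity check on your own response table, a small hypothetical lint helper (not part of the model) can verify that every key follows one of the trained prefixes; {goodbye} is included here because it appears in the example tables below.

```python
import re

# Prefixes the model was trained on (per the docs above), plus {goodbye},
# which appears in the example response tables. This helper is hypothetical.
TRAINED_PREFIXES = ("greeting", "answer", "question", "statement", "url", "goodbye")

def check_placeholder(key):
    """True if a key like '{answerCardLocked}' follows a trained prefix."""
    m = re.fullmatch(r"\{([A-Za-z]+)\}", key)
    return bool(m) and m.group(1).startswith(TRAINED_PREFIXES)
```

Running this over your table at startup catches typos like a stray {FAQCardLocked} before they reach the model.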

Why not just use a system prompt with a general model?

You can ask a general model to behave like this purely via system prompt, but it has to be a reasonably large model to do it well, and it may still fail sometimes. Because CAGE is fine-tuned, a much smaller model (as small as 0.6B!) can reliably produce correct output, as shown in our tests:

Size Prompt-Only CAGE
600M ❌ Thinking tags + freeform text ✅ Perfect format
1.7B ❌ Adds full text after placeholders ✅ Perfect format

Since these are small models they are best used for simple FAQs where not much intelligence is required in deciding which placeholder to use. Bigger models can be made available if there is enough demand.

The model leans towards always greeting the user. If it greets unnecessarily in your situation, you can preload the conversation history with an initial greeting, e.g.:

{'role': 'user', 'content': 'Hi'},
{'role': 'assistant', 'content': '{greeting}'}

This will be fixed in future versions of this model.

Example code for a single message

This is just a simple example of replacing placeholders in a single response.

import ollama
import re

response_table = {
    "{greeting}": "Welcome to BrightBank Support! How can I help today?",
    "{answerCardLocked}": "It looks like your card is currently locked. You can unlock it instantly in the app under Card Settings.",
    "{answerResetPassword}": "You can reset your password from the sign-in screen by tapping 'Forgot password.'",
    "{answerRefundTimeline}": "Refunds typically take 3–10 business days to appear.",
    "{questionTransactionDetails}": "Which transaction are you referring to (merchant name, amount, and date)?",
    "{statementICantHelpWithThat}": "Sorry, I can't help with that. Please contact support at 555-686-8686.",
    "{goodbye}": "Thanks for banking with BrightBank!",
    "{urlBrightBankLogin}" : "https://app.brightbank.example/login"
}

response_table_as_string = "\n".join(f"{k}:{v}" for k, v in response_table.items())

system_prompt = f"""You are a customer support assistant. Your job is to understand what the customer needs and answer using only the appropriate placeholder.
Available placeholders:
{response_table_as_string}"""

response = ollama.chat(
    model='gurubot/cage:1.7b-Q4_K_M',
    messages=[
        {'role': 'system', 'content': system_prompt},
        {'role': 'user', 'content': 'Hi'},
        {'role': 'assistant', 'content': '{greeting}'},
        {'role': 'user', 'content': 'I forgot my password'}
    ],
    options={'temperature': 0.1}
)

placeholder_string = response['message']['content']

ph_re = re.compile(r"\{.*?\}")

placeholders = ph_re.findall(placeholder_string)

final_response = "".join( response_table.get(placeholder, f"full text for {placeholder} not found") for placeholder in placeholders)

print(final_response)

output of this:

You can reset your password from the sign-in screen by tapping 'Forgot password.'

Interactive chat example

This lets you chat interactively and see which placeholders are being used.

import ollama
import re

response_table = {
    "{greeting}": "Welcome to BrightBank Support! How can I help today?",
    "{answerCardLocked}": "It looks like your card is currently locked. You can unlock it instantly in the app under Card Settings.",
    "{answerResetPassword}": "You can reset your password from the sign-in screen by tapping 'Forgot password.'",
    "{answerRefundTimeline}": "Refunds typically take 3–10 business days to appear.",
    "{questionTransactionDetails}": "Which transaction are you referring to (merchant name, amount, and date)?",
    "{statementICantHelpWithThat}": "Sorry, I can't help with that. Please contact support at 555-686-8686.",
    "{goodbye}": "Thanks for banking with BrightBank!",
    "{urlBrightBankLogin}": "https://app.brightbank.example/login"
}

response_table_as_string = "\n".join(f"{k}:{v}" for k,v in response_table.items())

system_prompt = f"""You are a customer support assistant. Your job is to understand what the customer needs and answer using only the appropriate placeholder.
Available placeholders:
{response_table_as_string}"""

ph_re = re.compile(r"\{.*?\}")

def expand_placeholders(placeholder_string):
    """Convert placeholder string to human-readable text."""
    placeholders = ph_re.findall(placeholder_string)
    return "".join(response_table.get(placeholder, f"full text for {placeholder} not found") for placeholder in placeholders)

def display_response(placeholder_string):
    """Display the response with placeholders in brackets."""
    expanded = expand_placeholders(placeholder_string)
    placeholders = ph_re.findall(placeholder_string)
    placeholder_text = " ".join(f"[{p}]" for p in placeholders)
    return f"{expanded} {placeholder_text}"

messages = [
    {'role': 'system', 'content': system_prompt},
]

print("=== BrightBank Support Chat (Interactive) ===")
print("Type 'quit' or 'exit' to end the conversation\n")

while True:
    user_input = input("You: ").strip()
    
    if user_input.lower() in ['quit', 'exit']:
        print("\nEnding conversation. Goodbye!")
        break
    
    if not user_input:
        continue
    
    messages.append({'role': 'user', 'content': user_input})
    
    response = ollama.chat(
        model='gurubot/cage:1.7b-Q4_K_M',
        messages=messages,
        options={'temperature': 0.3}
    )
    
    placeholder_string = response['message']['content']
    
    display_text = display_response(placeholder_string)
    print(f"Assistant: {display_text}\n")
    
    messages.append({'role': 'assistant', 'content': placeholder_string})

example interactive output with the chosen placeholder shown in brackets:

You: hi
Assistant: Welcome to BrightBank Support! How can I help today? [{greeting}]

You: my passw not working
Assistant: You can reset your password from the sign-in screen by tapping 'Forgot password.' [{answerResetPassword}]

You: how can i get more info
Assistant: Which transaction are you referring to (merchant name, amount, and date)? [{questionTransactionDetails}]

You: u smell
Assistant: Sorry, I can't help with that. Please contact support at 555-686-8686. [{statementICantHelpWithThat}]

You: byee
Assistant: Thanks for banking with BrightBank! [{goodbye}]

Advanced

Dialog trees:

In your code, you can alter the response table placed in the system prompt based on the last response from the LLM. For example, if the last question the LLM asked was “do you want to pay by cash, cheque or card?”, you could populate the response table with only the responses relevant to that topic. Using this you can build a dialog tree, organising a larger set of answers into linked nodes rather than keeping them all in one long context. Keeping the number of placeholders in context at any one time reasonable gives the LLM the best chance of picking the correct response.
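A minimal sketch of that idea (the node names, tables, and transitions here are invented for illustration): keep one small response table per dialog node, and move to a new node whenever the model’s last placeholder warrants it.

```python
# Sketch: dialog-tree nodes, each with its own small response table
# (node names and contents are illustrative).
NODES = {
    "root": {
        "{questionPaymentMethod}": "Do you want to pay by cash, cheque or card?",
        "{statementICantHelpWithThat}": "Sorry, I can't help with that.",
    },
    "payment": {
        "{answerCardPayment}": "Tap 'Pay by card' and enter your details.",
        "{answerCashPayment}": "Cash is accepted at any branch.",
        "{statementICantHelpWithThat}": "Sorry, I can't help with that.",
    },
}

# Which placeholder moves the conversation to which node.
TRANSITIONS = {"{questionPaymentMethod}": "payment"}

def system_prompt_for(node):
    """Rebuild the system prompt from the current node's table only."""
    listing = "\n".join(f"{k}: {v}" for k, v in NODES[node].items())
    return f"Answer using only one of these placeholders:\n{listing}"

def next_node(current, last_placeholder):
    """Follow a transition if the last placeholder defines one."""
    return TRANSITIONS.get(last_placeholder, current)
```

In the chat loop, call next_node() after each model reply and rebuild the system prompt with system_prompt_for() before the next call, so the model only ever sees the placeholders relevant to the current branch.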

Acronyms:

You don’t have to write out camel-case placeholder terms like {questionTransactionDetails}, you could for example use {QTD} and in testing this performed acceptably well, however the semantic meaning in writing full words is an extra hint to the LLM worth keeping. In well written code you should still only need to write out the full term once in the response table definition in any case.

Tool Calling:

While placeholders cannot be parametrised in this version of the model, you can use the placeholder output as a simple form of tool calling: for example, detecting {goodbye} can trigger deactivating the chat window, and {urlBrightBankLogin} could automatically open the login page.
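A sketch of that pattern (the handlers below are hypothetical stand-ins for your own UI code): map specific placeholders to side effects and fire them after substitution.

```python
# Sketch: treat certain placeholders as triggers for side effects
# (the handlers are hypothetical stand-ins for real UI code).
events = []

def close_chat():
    events.append("chat-closed")          # e.g. hide the chat widget

def open_login():
    events.append("login-page-opened")    # e.g. navigate to the login URL

ACTIONS = {"{goodbye}": close_chat, "{urlBrightBankLogin}": open_login}

def run_actions(placeholders):
    """Fire the handler for any placeholder that has one; ignore the rest."""
    for p in placeholders:
        if p in ACTIONS:
            ACTIONS[p]()
```

Because the model can only emit pre-approved placeholders, this gives you deterministic triggers without parsing free-form text.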