This model is a minor revision of ertghiu256/Qwen3-4b-tcomanr-merge-v2.3.
It aims to combine the reasoning, code, and math capabilities of Qwen3 4b Thinking 2507 by merging it with several Qwen3 4b finetunes. Note that this model's reasoning traces can be very long.
Tcomanr 2.5 ships with a new, modified Qwen 3 chat template that adds improved thinking-mode commands (a minimal usage sketch follows the list):

/think: Long chain-of-thought thinking mode.
/shortthink: Short, brief step-by-step thinking mode.
/nothink: No thinking at all, just an immediate answer. (Note: you need to use /nothink to completely stop the thinking.)
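As a rough sketch, here is one way to use these commands through the ollama Python client; appending the command to the end of the user prompt is an assumption based on the note above, and the prompt text and Q8_0 tag are only examples.

```python
# Minimal sketch (assumption: the thinking-mode command is appended to the user prompt).
# Requires the `ollama` Python package and a local Ollama server with the model pulled.
import ollama

response = ollama.chat(
    model="ertghiu256/Qwen3-4b-tcomanr-merge-v2.5:Q8_0",
    messages=[
        {"role": "user", "content": "Explain what a hash map is. /nothink"},
    ],
)
print(response["message"]["content"])
```

You can run this model through multiple interfaces.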
Run this command:
ollama run ertghiu256/Qwen3-4b-tcomanr-merge-v2.5:Q8_0
or, for the Q5_K_M quant:
ollama run ertghiu256/Qwen3-4b-tcomanr-merge-v2.5:Q5_K_M
or, for the IQ4_NL quant:
ollama run ertghiu256/Qwen3-4b-tcomanr-merge-v2.5:IQ4_NL
For more details, see the Hugging Face page.
Recommended parameters:

Reasoning:
temp: 0.6
num_ctx: ≥ 8192
top_p: 0.95
top_k: 20

Non-reasoning:
temp: 0.7
num_ctx: ≥ 2048
top_p: 0.85
top_k: 20

Repeat penalty: 1.0 - 1.1
System prompt:
"You are 'Tcomanr 2.5', an AI model made by Ertghiu256 using the Qwen3 4b 2507 base model made by Alibaba. You are a helpful, playful, and neutral AI chatbot. Use markdown formatting and emojis to make your responses less plain."
This model was merged using the TIES merge method, with Qwen/Qwen3-4B-Thinking-2507 as the base. The models included in the merge are listed in the YAML configuration below, which was used to produce this model:
models:
  - model: ertghiu256/qwen3-math-reasoner
    parameters:
      weight: 0.85
  - model: ertghiu256/qwen3-4b-code-reasoning
    parameters:
      weight: 0.9
  - model: ertghiu256/qwen-3-4b-mixture-of-thought
    parameters:
      weight: 0.95
  - model: POLARIS-Project/Polaris-4B-Preview
    parameters:
      weight: 0.95
  - model: ertghiu256/qwen3-multi-reasoner
    parameters:
      weight: 0.9
  - model: ertghiu256/Qwen3-Hermes-4b
    parameters:
      weight: 0.75
  - model: ValiantLabs/Qwen3-4B-Esper3
    parameters:
      weight: 0.75
  - model: Tesslate/WEBGEN-4B-Preview
    parameters:
      weight: 1.0
  - model: ValiantLabs/Qwen3-4B-ShiningValiant3
    parameters:
      weight: 0.9
  - model: huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated
    parameters:
      weight: 0.9
  - model: Qwen/Qwen3-4B-Thinking-2507
    parameters:
      weight: 1.0
  - model: Qwen/Qwen3-4b-Instruct-2507
    parameters:
      weight: 1.0
  - model: GetSoloTech/Qwen3-Code-Reasoning-4B
    parameters:
      weight: 0.9
  - model: ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3
    parameters:
      weight: 1.0
  - model: janhq/Jan-v1-4B
    parameters:
      weight: 0.35
  - model: Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v2
    parameters:
      weight: 0.9
  - model: quelmap/Lightning-4b
    parameters:
      weight: 0.8
merge_method: ties
base_model: Qwen/Qwen3-4B-Thinking-2507
parameters:
  normalize: true
  int8_mask: true
  lambda: 1.0
dtype: float16
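To reproduce the merge, a configuration like the one above is typically passed to mergekit's YAML entry point (this assumes mergekit is installed; the config file name and output directory below are only placeholders):
mergekit-yaml config.yaml ./Qwen3-4b-tcomanr-merge-v2.5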
Thanks to the Qwen team for providing the Qwen 3 4b and Qwen 3 4b 2507 models used to build this model, and thanks to all of the makers of the finetuned models included in the merge.