27 3 weeks ago

Uses a 4096-token context window so that flash attention works as intended. Also trying a new template and system prompt to see how the model reacts.

tools


77299afe1734 · 5.2GB

qwen3 · 8.19B · Q4_K_M
MIT License Copyright (c) 2023 DeepSeek Permission is hereby granted, free of charge, to any person
<|im_start|>system {{- if .System }}{{ .System }}{{ else }}You are a concise, accurate coding assist
# Advanced Coding Assistant (ReAct + RAG, tool-calling, no chain-of-thought) <modes> <direct>Answer
{ "num_ctx": 4096, "repeat_penalty": 1.15, "seed": 42, "stop": [ "<|start_he
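The visible parameters above (num_ctx, repeat_penalty, seed) can also be sent per request rather than baked into the model. Below is a minimal sketch that builds a request body for Ollama's /api/generate endpoint with those values. The model name "my-qwen3-coder" and the helper build_generate_request are hypothetical, and the stop list is left out because it is truncated in the listing.

```python
import json

# Values copied from the parameter preview in the listing.
# The "stop" list is truncated there, so it is intentionally omitted.
OPTIONS = {"num_ctx": 4096, "repeat_penalty": 1.15, "seed": 42}


def build_generate_request(model: str, prompt: str, options: dict = OPTIONS) -> dict:
    """Build a JSON-serializable body for a non-streaming generate call."""
    return {
        "model": model,            # hypothetical name; use your own tag
        "prompt": prompt,
        "stream": False,           # ask for a single JSON response
        "options": dict(options),  # copy so callers cannot mutate OPTIONS
    }


if __name__ == "__main__":
    body = build_generate_request("my-qwen3-coder", "Reverse a string in Python.")
    # This body would be POSTed to http://localhost:11434/api/generate
    # (the default local address, assumed here).
    print(json.dumps(body, indent=2))
```

Keeping the options in one dict makes it easy to compare runs: with a fixed seed and the same context size, changes in output are more likely to come from the template or system prompt being tested.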
