
Instruct version of the YandexGPT 5 Lite large language model with 8B parameters and a 32k-token context length (Q5_K_M quantised build).

Default tag: 8b · 4788a8871969 · 4.9GB · llama · 8.04B · Q4_K_M
License: YandexGPT-5-Lite-8B license agreement (Russian; truncated preview).
Template: Go chat template using `<s>`, `[SEP]`, `Ассистент:` (Assistant) and `Пользователь:` (User) markers (truncated preview).
Params: stop tokens `<s>`, `[SEP]` and the user-turn marker (truncated preview).

Readme

Based on https://huggingface.co/mradermacher/YandexGPT-5-Lite-8B-instruct-GGUF

| Feature  | Value |
|----------|-------|
| vision   | false |
| thinking | false |
| tools    | false |
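
To verify these flags on a local install, the standard Ollama `/api/show` endpoint can be queried; a sketch below with the same placeholder model tag. Newer Ollama releases include a `capabilities` list in the response, older ones only return details, parameters and template.

```python
import requests

MODEL = "yandexgpt-5-lite-8b-instruct:8b"  # placeholder tag

resp = requests.post("http://localhost:11434/api/show", json={"model": MODEL}, timeout=30)
resp.raise_for_status()
info = resp.json()

print(info.get("capabilities", []))  # e.g. vision/tools, if the server reports them
print(info.get("details", {}))       # family, parameter size, quantization level
print(info.get("parameters", ""))    # default stop tokens and other options
```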
| Device            | Speed, tokens/s | Context | VRAM, GB | Versions       |
|-------------------|-----------------|---------|----------|----------------|
| RTX 3090 24 GB    | ~105            | 4096    | 6.9      | Q5_K_M, 0.12.2 |
| RTX 3090 24 GB    | ~105            | 15360   | 9.2      | Q5_K_M, 0.12.2 |
| RTX 2080 Ti 11 GB | ~74             | 4096    | 6.9      | Q5_K_M, 0.12.2 |
| RTX 2080 Ti 11 GB | ~75             | 15360   | 9.2      | Q5_K_M, 0.12.2 |
| M1 Max 32 GB      | ~41             | 4096    | 6.6      | Q5_K_M, 0.12.2 |
| M1 Max 32 GB      | ~41             | 15360   | 8.2      | Q5_K_M, 0.12.2 |
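
The Context column corresponds to Ollama's `num_ctx` option, which is what drives the VRAM differences above. A hedged sketch of setting it per request via `/api/generate`, again with a placeholder model tag:

```python
import requests

MODEL = "yandexgpt-5-lite-8b-instruct:8b"  # placeholder tag

def generate(prompt: str, num_ctx: int = 4096) -> str:
    """Single-shot generation with an explicit context window (num_ctx)."""
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": MODEL,
            "prompt": prompt,
            "stream": False,
            # Larger num_ctx values reserve more VRAM, as in the table above.
            "options": {"num_ctx": num_ctx},
        },
        timeout=600,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(generate("Summarise the YandexGPT 5 Lite release in one sentence.", num_ctx=15360))
```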