second_constantine/t-lite-it-2.1:8b


21 Downloads Updated 3 days ago

T-lite-it-2.1 is an efficient Russian model built on the Qwen 3 architecture, with significant improvements in instruction following and added support for tool calling (quantized to Q4_K_M).

tools thinking 8b


313cea45e9fc · 5.0GB

arch: qwen3

parameters: 8.19B

quantization: Q4_K_M

5.0GB

{{- $lastUserIdx := -1 -}} {{- range $idx, $msg := .Messages -}} {{- if eq $msg.Role "user" }}{{ $la

1.5kB

{ "repeat_penalty": 1, "stop": [ "<|im_start|>", "<|im_end|>" ], "te

99B
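
The description above mentions tool-calling support. As a rough sketch of what a request could look like, the Python below builds a payload for Ollama's `/api/chat` endpoint and offers the model a single tool. The endpoint URL assumes a default local Ollama install, and the `get_weather` tool, the helper names `build_chat_request` and `send_chat_request`, and the sample context size are hypothetical illustrations, not something this page specifies.

```python
import json
import urllib.request

MODEL = "second_constantine/t-lite-it-2.1:8b"   # model tag from this page
OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default local Ollama endpoint


def build_chat_request(prompt: str) -> dict:
    """Build an /api/chat payload that offers one hypothetical tool to the model."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Return the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [weather_tool],
        "stream": False,               # ask for one JSON reply instead of a stream
        "options": {"num_ctx": 4096},  # one of the context sizes from the table below
    }


def send_chat_request(payload: dict) -> dict:
    """POST the payload to a running Ollama server and return the parsed JSON reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running and the model pulled, `send_chat_request(build_chat_request("What is the weather in Kazan?"))` should return a reply whose `message` field may carry a `tool_calls` list if the model decides to use the tool. Executing the tool and sending its result back is left to the caller.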

Device	Speed (tokens/s)	Context (tokens)	VRAM (GB)	Versions
RTX 3090 24GB	~117	4096	6.6	Q5_K_M, 0.13.3
RTX 3090 24GB	~117	15360	8.3	Q5_K_M, 0.13.3
RTX 3090 24GB	~119	4096	5.8	Q4_K_M, 0.13.4
RTX 3090 24GB	~129	15360	7.5	Q4_K_M, 0.13.4
RTX 2080 Ti 11GB	~77	4096	6.6	Q5_K_M, 0.13.3
RTX 2080 Ti 11GB	~77	15360	8.3	Q5_K_M, 0.13.3
RTX 2080 Ti 11GB	~84	4096	5.8	Q4_K_M, 0.13.4
RTX 2080 Ti 11GB	~84	15360	7.5	Q4_K_M, 0.13.4
RTX 3070 Ti Mobile 8GB	~65	4096	6.6	Q5_K_M, 0.13.3
RTX 3070 Ti Mobile 8GB	~37	15360	8.3 (11%/89% CPU/GPU)	Q5_K_M, 0.13.3
RTX 3070 Ti Mobile 8GB	~71	4096	5.8	Q4_K_M, 0.13.4
RTX 3070 Ti Mobile 8GB	~71	15360	7.5	Q4_K_M, 0.13.4
M1 Max 32GB	~36	4096	6.3	Q5_K_M, 0.13.3
M1 Max 32GB	~37	15360	7.2	Q5_K_M, 0.13.3
M1 Max 32GB	~36	4096	5.5	Q4_K_M, 0.13.4
M1 Max 32GB	~36	15360	6.3	Q4_K_M, 0.13.4
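
On the NVIDIA rows, going from a 4096 to a 15360 context adds 1.7 GB of VRAM for both quantizations. The sketch below turns that into a rough per-token cost and uses it to guess the footprint at other context sizes. It assumes memory grows linearly with context between the two measured points, which the table does not confirm, and the function names are hypothetical. The M1 Max rows grow more slowly, so the same slope should not be applied to them.

```python
def vram_per_token_gb(ctx_small: int, vram_small_gb: float,
                      ctx_large: int, vram_large_gb: float) -> float:
    """Slope of VRAM use between two measured context sizes, in GB per token."""
    return (vram_large_gb - vram_small_gb) / (ctx_large - ctx_small)


def estimate_vram_gb(ctx: int, ctx_ref: int, vram_ref_gb: float,
                     per_token_gb: float) -> float:
    """Linear estimate of VRAM at `ctx`, anchored at a measured reference point."""
    return vram_ref_gb + (ctx - ctx_ref) * per_token_gb


# Q5_K_M on the NVIDIA cards: 6.6 GB at 4096 tokens, 8.3 GB at 15360 tokens.
slope_q5 = vram_per_token_gb(4096, 6.6, 15360, 8.3)

# Assumption: linear growth, so an 8192-token context lands between the two rows.
estimate_8k = estimate_vram_gb(8192, 4096, 6.6, slope_q5)

print(f"Q5_K_M: ~{slope_q5 * 1024:.3f} MB per token, ~{estimate_8k:.1f} GB at 8192 tokens")
```

Under that assumption an 8192-token context with Q5_K_M would need roughly 7.2 GB, which sits close to the limit of an 8 GB card. The measured 8.3 GB at 15360 tokens exceeds that limit, which is consistent with the 3070 Ti Mobile row offloading part of the work to the CPU and dropping to about 37 tokens/s.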