124 3 weeks ago

Highly-rated roleplay model based on Qwen 2.5. Made by allura-org (Hugging Face).

tools

09aa57422b76 · 13GB · qwen2 · 32.8B · IQ3_XXS
Write {{char}}'s next reply in this fictional roleplay with {{user}}.
{{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|> {{- else if .M

Readme

RP INK / I-MATRIX / 32B / I-QUANT

This model has received good praise for its creative writing skill. It is particularly good at following the writing style of the prompt itself. For some reference, if it matters, I'd say it writes better than Magnum V4, even at a lower bit depth than Magnum. To fit as many parameters into as little VRAM as possible, weighted I-quants will be listed.

Note that I-quants give up some token generation speed relative to K-quants in exchange for better quality per bit. The 3-bit IQ3_XXS quant is recommended for 16GB GPUs with a well-supported parallel compute platform (CUDA or ROCm); without one, 16GB GPUs should use Q2_K or below. Full GPU offload at IQ3_S with 16GB of VRAM is possible, but you will be pushing it. These models were sourced from GGUF files on Hugging Face.
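To see why 16GB is "pushing it" at IQ3_S, here is a rough back-of-the-envelope sketch of weight storage per quant level. The bits-per-weight figures are approximations drawn from llama.cpp's quantization tables, not exact values for this model, and real GGUF files add metadata on top, plus the KV cache and compute buffers needed at inference time.

```python
# Rough VRAM estimate for weight storage at different quant levels.
# Bits-per-weight values are approximate (from llama.cpp's quant tables);
# actual files differ slightly, and inference also needs room for the
# KV cache and compute buffers on top of the weights.
PARAMS = 32.8e9  # Qwen 2.5 32B parameter count, as listed above

BPW = {          # approximate effective bits per weight
    "Q2_K":    2.63,
    "IQ3_XXS": 3.06,
    "IQ3_S":   3.44,
}

def weight_gib(bpw: float, params: float = PARAMS) -> float:
    """GiB needed for the weights alone at a given bits-per-weight."""
    return params * bpw / 8 / 2**30

for name, bpw in BPW.items():
    print(f"{name:8s} ~{weight_gib(bpw):5.1f} GiB")
```

IQ3_S lands around 13 GiB of weights alone, which leaves only a couple of GiB of a 16GB card for the KV cache and buffers, while Q2_K leaves comfortable headroom.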

Original model (allura-org):

GGUF weighted quantizations (mradermacher):

OBLIGATORY_PICTURE_RPINK.png