MAGNUM V4 / I-MATRIX / 9-27B / I-QUANT
A reliable storytelling model, and a personal model of choice even in its static quants. Its training data is broad, so many references you make will be picked up even without lorebooks. To fit as many parameters as possible into limited VRAM, weighted K-quants and I-quants are listed, along with several of the distillations the original creator has released. Note that I-quants give up some token generation speed relative to K-quants in exchange for smaller file sizes.
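To make the size trade-off concrete, here is a back-of-the-envelope sketch: quantized weight size is roughly parameters times bits-per-weight divided by 8. The bits-per-weight figures below are approximate llama.cpp averages, not exact values for these files, and real VRAM use adds KV cache and compute buffers on top.

```python
# Rough weight size for a quantized GGUF model:
# params (billions) * bits-per-weight / 8 = GB of weights.
# BPW values are approximations, not exact figures for these files.
APPROX_BPW = {
    "IQ2_S": 2.5, "IQ3_S": 3.45, "IQ4_XS": 4.25,
    "Q3_K_M": 3.9, "Q4_K_M": 4.85, "Q6_K": 6.6,
}

def approx_weight_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the quantized weights in GB."""
    return params_billion * APPROX_BPW[quant] / 8

# e.g. a 12B model at IQ4_XS is ~6.4 GB of weights -- tight but workable on 8GB.
print(f"{approx_weight_gb(12, 'IQ4_XS'):.1f} GB")
```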
For your specific VRAM budget, the following is recommended (a toy selector that implements this table follows the list):
For 4GB GPUs: IQ2_S. This collection normally assumes at least 6GB, but if you wish to run Magnum V4 anyway, the 2-bit I-quant is available. Do not expect general inference ability of any kind; treat this one as strictly roleplay only.
For 6GB GPUs: 9b_IQ4_XS. It’ll work if it’s the only thing running. Video streaming may slow it down. If it does, try IQ3_S.
For 8GB GPUs: 12b_IQ4_XS. It runs fast enough on 8GB GPUs without needing to drop to the 9b models.
For 12GB GPUs: 12b_Q6_K. It’ll work fine, though if you want to experiment there are larger models listed that will fit in VRAM.
For 16GB GPUs: 27b_Q3_K_M or 27b_IQ3_S. These are recommended, but if your GPU struggles with this, the 22b_Q4_K_M works.
For >=20GB GPUs: Any model listed will fit fine in VRAM.
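The same arithmetic, turned into the toy selector mentioned above. This is a sketch under assumptions: the bits-per-weight values are approximate, the 1 GB headroom for KV cache and OS buffers is a guess, and the tags are illustrative rather than this collection's exact names.

```python
# Toy selector over the table above: return the largest listed quant whose
# estimated weight size fits the available VRAM minus headroom.
RECOMMENDED = [  # (params in B, quant tag, approx bits/weight), small -> large
    (9,  "IQ2_S",  2.5),
    (9,  "IQ4_XS", 4.25),
    (12, "IQ4_XS", 4.25),
    (12, "Q6_K",   6.6),
    (27, "IQ3_S",  3.45),
    (27, "Q3_K_M", 3.9),
]

def pick_quant(vram_gb: float, headroom_gb: float = 1.0) -> str:
    budget = vram_gb - headroom_gb
    best = f"{RECOMMENDED[0][0]}b_{RECOMMENDED[0][1]}"  # smallest as fallback
    for params_b, quant, bpw in RECOMMENDED:
        if params_b * bpw / 8 <= budget:
            best = f"{params_b}b_{quant}"
    return best

print(pick_quant(6))   # 9b_IQ4_XS
print(pick_quant(8))   # 12b_IQ4_XS
print(pick_quant(16))  # 27b_Q3_K_M
```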
These models were pulled as GGUF files from Hugging Face.
Original model (anthracite-org):
GGUF weighted quantizations (mradermacher):
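Once a quant is pulled, a minimal sketch of streaming a reply through the official `ollama` Python client (`pip install ollama`); the model tag below is a placeholder, so substitute whichever quant you actually pulled.

```python
# Stream a chat reply from a locally pulled quant via the ollama Python client.
import ollama

stream = ollama.chat(
    model="magnum-v4:12b_IQ4_XS",  # hypothetical tag -- adjust to your pull
    messages=[{"role": "user", "content": "Open a scene in a rain-soaked city."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
```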