MoE implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser



Description

This model is a medium-sized MoE (mixture-of-experts) implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser.

The 2x7b configuration offers better performance than a standard 7b model, even when loaded in 4-bit (about 9 GB of VRAM).

Loaded in 4-bit, this 2x7b model scores 0.8270 on HellaSwag, which is higher than the base model achieves on its own in full precision.

| Name | Quant method | Bits | Size (GB) | Max RAM required (GB) | Use case |
|------|--------------|------|-----------|----------------------|----------|
| exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M | Q5_K_M | 5 | 9.13 | 11.63 | large, very low quality loss - recommended |
| exer/laser-dolphin-mixtral:2x7b-dpo-q6_K | Q6_K | 6 | 10.57 | 13.07 | very large, extremely low quality loss |
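
As a usage illustration, here is a minimal sketch that queries the Q5_K_M tag from the table above through the Ollama Python client. It assumes the `ollama` Python package is installed, an Ollama server is running locally, and the model has already been pulled; the system and user messages are placeholders.

```python
# Minimal sketch: chat with the Q5_K_M tag via the Ollama Python client.
# Assumes `pip install ollama`, a running Ollama server, and that the model
# has been pulled (e.g. `ollama pull exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M`).
import ollama

response = ollama.chat(
    model="exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M",  # tag from the table above
    messages=[
        {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
        {"role": "user", "content": "Explain what a mixture-of-experts model is in one paragraph."},
    ],
)

print(response["message"]["content"])
```

The chat endpoint applies the model's prompt template automatically, so the ChatML markers shown in the next section do not need to be added by hand when using it.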

Prompt Format

This model follows the same ChatML prompt format as the base dolphin-2.6-mistral-7b-dpo-laser model.

Prompt format:

```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
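
If the template has to be applied by hand, for example when calling a raw completion endpoint instead of a chat endpoint, a small helper like the following can assemble the prompt. This is an illustrative sketch, not part of the model; the function name and example messages are arbitrary.

```python
# Illustrative helper (hypothetical): builds a ChatML-style prompt matching
# the template above for runtimes that expect a single prompt string.
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

if __name__ == "__main__":
    print(build_chatml_prompt(
        "You are Dolphin, a helpful AI assistant.",
        "Summarize what a 2x7b mixture-of-experts model is.",
    ))
```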