MoE implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
277 Pulls Updated 10 months ago
6a342d3f0558 · 9.1GB
model
arch llama · parameters 12.9B · quantization Q5_K_M · 9.1GB
template
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
107B
params
{"num_ctx":16384,"repeat_penalty":1.1,"stop":["\u003c/s\u003e","USER:","ASSSISTANT:","[INST]","[/INS
208B
system
You are a helpful coding assistant that can assist in writing code in various programming languages…
382B
Readme
Description
This model is a medium-sized MoE implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser.
A 2x7b configuration offers better performance than a standard 7b model, even when loaded in 4-bit (roughly 9 GB of VRAM).
Loaded in 4-bit, this 2x7b model reaches a HellaSwag score of 0.8270, which is higher than the base model achieves on its own at full precision.
| Name | Quant method | Bits | Size (GB) | Max RAM required (GB) | Use case |
| --- | --- | --- | --- | --- | --- |
| exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M | Q5_K_M | 5 | 9.13 | 11.63 | large, very low quality loss - recommended |
| exer/laser-dolphin-mixtral:2x7b-dpo-q6_K | Q6_K | 6 | 10.57 | 13.07 | very large, extremely low quality loss |
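As a quick usage sketch (assuming a local Ollama server on its default port 11434 and that the Q5_K_M tag above has already been pulled), the request below also shows how the defaults listed under params can be overridden per request; the example prompt is purely illustrative:

```python
import json
import urllib.request

# Sketch only: assumes Ollama is running locally and the model tag below is available.
payload = {
    "model": "exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M",
    "prompt": "Write a Python function that reverses a string.",
    "stream": False,
    # Per-request overrides matching the defaults shown in the params section.
    "options": {
        "num_ctx": 16384,
        "repeat_penalty": 1.1,
        "stop": ["</s>", "USER:", "ASSISTANT:", "[INST]", "[/INST]"],
    },
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```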
Prompt Format
This model follows the same ChatML prompt format as the base dolphin model:
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
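Ollama applies this template automatically, but for runtimes that expect a raw prompt string, a minimal helper (names are illustrative) that assembles the format above might look like:

```python
def build_prompt(system_message: str, prompt: str) -> str:
    """Assemble a raw ChatML-style prompt matching the template shown above."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_prompt("You are a helpful coding assistant.", "Reverse a string in Python."))
```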