This model is the launch partner of the capybara-dpo dataset, built with ⚗️ distilabel. It is a preference-tuned OpenHermes-2.5-Mistral-7B.
CapybaraHermes has been preference-tuned with LoRA and TRL for 3 epochs using Argilla's dpo-mix-7k dataset.
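For intuition, the objective that DPO-based preference tuning (as implemented in TRL's `DPOTrainer`) optimizes can be sketched in plain Python. This is an illustrative toy, not the training code used for this model; the function name and arguments are ours, and in practice the log-probabilities come from the policy and frozen reference models over chosen/rejected completions:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Rewards are implicit log-probability ratios between the policy
    being tuned and a frozen reference model; beta controls how far
    the policy may drift from the reference.
    """
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # Negative log-sigmoid of the margin: the loss shrinks as the
    # policy prefers the chosen completion over the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At initialization (policy equals reference) the margin is zero and the loss is log 2; it decreases as the policy assigns relatively more probability to the chosen responses.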
To test the impact on multi-turn performance, we used MTBench. For reference, we also include the Nous benchmark results and Mistral-7B-Instruct-v0.2, a strong 7B model on MTBench:
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | MTBench First Turn | MTBench Second Turn | Nous avg. | MTBench avg. |
|---|---|---|---|---|---|---|---|---|
| argilla/CapybaraHermes-2.5-Mistral-7B | 43.8 | 73.35 | 57.07 | 42.44 | 8.24375 | 7.5625 | 54.16 | 7.903125 |
| teknium/OpenHermes-2.5-Mistral-7B | 42.75 | 72.99 | 52.99 | 40.94 | 8.25 | 7.2875 | 52.42 | 7.76875 |
| Mistral-7B-Instruct-v0.2 | 38.5 | 71.64 | 66.82 | 42.29 | 7.8375 | 7.1 | 54.81 | 7.46875 |
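The two "avg." columns in the table are simple means of the preceding scores, which is worth making explicit when comparing rows (variable names below are ours):

```python
def mean(xs):
    return sum(xs) / len(xs)

# CapybaraHermes row: Nous avg. = mean of the four Nous benchmarks,
# MTBench avg. = mean of the first- and second-turn scores.
capybarahermes_nous = mean([43.8, 73.35, 57.07, 42.44])  # 54.165, reported as 54.16
capybarahermes_mtbench = mean([8.24375, 7.5625])         # 7.903125
```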