4 models

| Name | Size | Context window | Input | Updated |
|---|---|---|---|---|
| capybarahermes-2.5-mistral:7b-q4_K_S | 4.1GB | 32K | Text | 1 year ago |
| capybarahermes-2.5-mistral:7b-q4_K_M | 4.4GB | 32K | Text | 1 year ago |
| capybarahermes-2.5-mistral:7b-q6_K | 5.9GB | 32K | Text | 1 year ago |
| capybarahermes-2.5-mistral:7b-q8_0 | 7.7GB | 32K | Text | 1 year ago |
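To chat with one of these quantizations locally, here is a minimal sketch using the official `ollama` Python client. It assumes Ollama is installed, the server is running, and the tag has been pulled; the q4_K_M tag and the prompt are arbitrary choices:

```python
# Minimal usage sketch with the ollama Python client
# (pip install ollama; any of the tags above works).
import ollama

response = ollama.chat(
    model="capybarahermes-2.5-mistral:7b-q4_K_M",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Give me three facts about capybaras."},
    ],
)
print(response["message"]["content"])
```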
This model is the launch partner of the capybara-dpo dataset, built with ⚗️ distilabel. It is a preference-tuned OpenHermes-2.5-Mistral-7B.
CapybaraHermes has been preference-tuned with LoRA and TRL for 3 epochs using Argilla's dpo-mix-7k dataset.
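As a rough illustration of that recipe, here is a minimal sketch of DPO tuning with LoRA via TRL. It is not the authors' exact training script: hyperparameters other than the epoch count are placeholders, and trainer argument names (e.g. `processing_class` vs. `tokenizer`) vary across TRL versions.

```python
# Illustrative sketch of the stated recipe: DPO + LoRA via TRL.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Preference pairs (chosen/rejected) from Argilla's dpo-mix-7k.
train_dataset = load_dataset("argilla/dpo-mix-7k", split="train")

peft_config = LoraConfig(            # LoRA adapters instead of full fine-tuning
    r=16, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM"
)

args = DPOConfig(
    output_dir="capybarahermes-dpo",
    num_train_epochs=3,              # 3 epochs, per the description
    per_device_train_batch_size=2,   # placeholder values from here on
    learning_rate=5e-5,
    beta=0.1,                        # DPO temperature
    logging_steps=50,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                  # with PEFT, the frozen base model
                                     # serves as the DPO reference
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,      # `tokenizer=` in older TRL versions
    peft_config=peft_config,
)
trainer.train()
```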
To test the impact on multi-turn performance, we used MT-Bench. We also include the Nous benchmark results, with Mistral-7B-Instruct-v0.2 for reference since it is a strong 7B model on MT-Bench. The most interesting aspect in the context of the capybara-dpo dataset is the improved MT-Bench second-turn scores.
For the merge lovers, we also preference-tuned Beagle14-7B with a mix of capybara-dpo and distilabel Orca pairs using the same recipe as NeuralBeagle (see YALL - Yet Another LLM Leaderboard for reference).
CapybaraHermes-2.5-Mistral-7B uses ChatML as its prompt format, which provides a structured way to engage the LLM in multi-turn chat. A system prompt establishes rules and roles and steers the interaction between the user and the model.
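For reference, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers; a minimal multi-turn prompt looks like this (the message contents are illustrative):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```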