llama3-8b-instruct-dpo-zh-loftq: DPO beta 0.5, LoRA rank 128, with LoftQ LoRA

Updated 4 months ago

No models have been pushed.
