331 2 years ago

A 3B parameter GPT-like model fine-tuned on a mix of publicly available datasets using DPO.