323 1 year ago

A 3B parameter GPT-like model fine-tuned on a mix of publicly available datasets using DPO.