internlm2.5_7b_distill is a distilled version of internlm2.5-7b-chat, trained on a 9k chain-of-thought (CoT) dataset. "psy" marks the variant that uses a psychological system prompt, while "_orpo" marks the variant further trained with ORPO on safety issues.