Model: 1559dbf39c77 · 15GB · qwen2 · 7.25B · F16

Template (ChatML, as shown on the page, truncated):

{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user

Parameters:

{ "stop": [ "<|im_start|>", "<|im_end|>" ] }

Readme

Introduction

Nxcode-CQ-7B-orpo is a fine-tune of Qwen/CodeQwen1.5-7B trained with ORPO (Monolithic Preference Optimization without Reference Model) on 100k samples of high-quality ranking data.

Evalplus

Benchmark        pass@1
HumanEval         86.6
HumanEval+        83.5
MBPP (v0.2.0)     82.3
MBPP+ (v0.2.0)    70.4
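For reference, pass@k scores like those above are usually computed with the standard unbiased estimator (from the Codex paper): given n samples per task of which c pass, pass@k = 1 - C(n-c, k) / C(n, k). A small sketch:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples drawn, c of them correct.

    Returns the probability that at least one of k randomly chosen
    samples passes. With k=1 this reduces to the fraction c/n.
    """
    if n - c < k:
        # fewer failures than k draws: at least one success is guaranteed
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)
```

Per-task values are then averaged over the benchmark to give the single percentage reported in the table.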

We use a simple template to generate solutions for EvalPlus:

"Complete the following Python function:\n{prompt}"
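Concretely, each EvalPlus problem's function stub is substituted into that template before being sent to the model. A minimal sketch (`build_evalplus_prompt` is a hypothetical helper name, not from the source):

```python
# The instruction template quoted above, with {prompt} as the placeholder
# for the problem's function stub.
TEMPLATE = "Complete the following Python function:\n{prompt}"

def build_evalplus_prompt(function_stub: str) -> str:
    """Fill the EvalPlus template with a problem's function stub."""
    return TEMPLATE.format(prompt=function_stub)
```

The resulting string is what gets wrapped as the user turn of the chat template.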

Evalplus Leaderboard

Model                           HumanEval   HumanEval+
GPT-4-Turbo (April 2024)           90.2        86.6
GPT-4 (May 2023)                   88.4        81.17
GPT-4-Turbo (Nov 2023)             85.4        79.3
CodeQwen1.5-7B-Chat                83.5        78.7
claude-3-opus (Mar 2024)           82.9        76.8
DeepSeek-Coder-33B-instruct        81.1        75.0
WizardCoder-33B-V1.1               79.9        73.2
OpenCodeInterpreter-DS-33B         79.3        73.8
speechless-codellama-34B-v2.0      77.4        72.0
GPT-3.5-Turbo (Nov 2023)           76.8        70.7
Llama3-70B-instruct                76.2        70.7

https://huggingface.co/NTQAI/Nxcode-CQ-7B-orpo