Nxcode-CQ-7B-orpo is a Monolithic Preference Optimization without Reference Model (ORPO) fine-tune of Qwen/CodeQwen1.5-7B on 100k samples of high-quality ranking data.
| EvalPlus | pass@1 |
|---|---|
| HumanEval | 86.6 |
| HumanEval+ | 83.5 |
| MBPP(v0.2.0) | 82.3 |
| MBPP+(v0.2.0) | 70.4 |
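The pass@1 numbers above follow the standard unbiased pass@k estimator introduced with HumanEval (Chen et al., 2021), which EvalPlus also uses. A minimal sketch of that estimator (the function name and sample counts here are illustrative, not from this model card):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: n generated samples, c of them correct."""
    if n - c < k:
        # Every size-k subset contains at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 the estimator reduces to the fraction of correct samples, c/n:
print(pass_at_k(10, 5, 1))  # 0.5
```

With greedy decoding (one sample per task), pass@1 is simply the fraction of tasks whose single completion passes all tests.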
We use a simple template to generate solutions for EvalPlus:

```
"Complete the following Python function:\n{prompt}"
```
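The template can be applied to a benchmark task like this (a minimal sketch; the HumanEval-style function stub below is illustrative, not taken from the benchmark):

```python
# Prompt template used for EvalPlus evaluation.
TEMPLATE = "Complete the following Python function:\n{prompt}"

# Illustrative HumanEval-style task: a function signature plus docstring.
task_prompt = (
    "def add(a: int, b: int) -> int:\n"
    '    """Return the sum of a and b."""\n'
)

# The formatted string is what gets sent to the model.
full_prompt = TEMPLATE.format(prompt=task_prompt)
print(full_prompt)
```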
For comparison, HumanEval and HumanEval+ pass@1 scores reported for other models:

| Models | HumanEval | HumanEval+ |
|---|---|---|
| GPT-4-Turbo (April 2024) | 90.2 | 86.6 |
| GPT-4 (May 2023) | 88.4 | 81.17 |
| GPT-4-Turbo (Nov 2023) | 85.4 | 79.3 |
| CodeQwen1.5-7B-Chat | 83.5 | 78.7 |
| claude-3-opus (Mar 2024) | 82.9 | 76.8 |
| DeepSeek-Coder-33B-instruct | 81.1 | 75.0 |
| WizardCoder-33B-V1.1 | 79.9 | 73.2 |
| OpenCodeInterpreter-DS-33B | 79.3 | 73.8 |
| speechless-codellama-34B-v2.0 | 77.4 | 72.0 |
| GPT-3.5-Turbo (Nov 2023) | 76.8 | 70.7 |
| Llama3-70B-instruct | 76.2 | 70.7 |