Quantization notes:
- fp32
- calibration_datav3.txt
- flash_attention
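
If these notes refer to the usual llama.cpp imatrix workflow (an assumption, not confirmed by this page), the quants would have been produced roughly along these lines; all paths, file names, and flags below are illustrative placeholders:

```bash
# Hypothetical reconstruction of the quantization steps, assuming llama.cpp tooling.

# Convert the HF checkpoint to an fp32 GGUF first (placeholder paths).
python convert_hf_to_gguf.py DeepSeek-Coder-V2/ \
  --outtype f32 --outfile deepseek-coder-v2-f32.gguf

# Compute an importance matrix over the calibration text, with flash attention enabled.
./llama-imatrix -m deepseek-coder-v2-f32.gguf \
  -f calibration_datav3.txt -o imatrix.dat -ngl 99 -fa

# Quantize to q4_0 using the importance matrix.
./llama-quantize --imatrix imatrix.dat \
  deepseek-coder-v2-f32.gguf deepseek-coder-v2-q4_0.gguf Q4_0
```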
The SYSTEM prompt and repeat penalty can be adjusted if the model starts repeating itself (but it shouldn't with these quants); see the Modelfile sketch after the quant list below.

DeepSeek-Coder-V2 is an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. It is further pre-trained from DeepSeek-Coder-V2-Base with 6 trillion tokens drawn from a high-quality, multi-source corpus.
Maximum context length: 128K
- q4_0: on an RTX 3090 (24 GB), fits entirely in VRAM with up to 46K of context
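
As a minimal sketch (not the exact setup used here) of running the q4_0 quant with a large context window on a 24 GB card, an Ollama Modelfile can pin num_ctx to roughly 46K tokens. The GGUF file name, model tag, system prompt, and exact num_ctx value below are assumptions; flash attention is enabled via the server environment variable:

```bash
# In one terminal: start the Ollama server with flash attention enabled.
OLLAMA_FLASH_ATTENTION=1 ollama serve

# In another terminal: write a Modelfile that pins a ~46K context window
# for the q4_0 GGUF (file name and prompt text are placeholders).
cat > Modelfile <<'EOF'
FROM ./deepseek-coder-v2-q4_0.gguf
PARAMETER num_ctx 46080
# Only raise the repeat penalty if you actually see repetition
# (it shouldn't happen with these quants).
# PARAMETER repeat_penalty 1.1
SYSTEM """You are a helpful coding assistant."""
EOF

# Build the model from the Modelfile and run it.
ollama create deepseek-coder-v2-q4_0 -f Modelfile
ollama run deepseek-coder-v2-q4_0 "Explain this function and suggest improvements."
```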