3,497 Downloads Updated 1 year ago
Updated 1 year ago
1 year ago
e64344c032f0 · 6.0GB ·
fp32calibration_datav3.txtflash_attentionSYSTEM promptrepeat (but it shouldn’t with these quants)DeepSeek-Coder-V2 is an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. DeepSeek-Coder-V2 is further pre-trained from DeepSeek-Coder-V2-Base with 6 trillion tokens sourced from a high-quality and multi-source corpus.
Maximum context length: 128K
- q4_0 on a RTX3090 24GB can fit in VRAM up to 46K context