
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.


74b583cd0ec8 · 404GB

deepseek2 · 671B · Q4_K_M
license: DEEPSEEK LICENSE AGREEMENT, Version 1.0, 23 October 2023. Copyright (c) 2023 DeepSeek. Section I: PREAMBLE …

params: { "num_ctx": 64000, "stop": [ "<|begin▁of▁sentence|>", "<|end▁of▁sentence|>", … ] }

template: {{- range $i, $_ := .Messages }} {{- if eq .Role "user" }}<|User|> {{- else if eq .Role "assistant" }} …

Readme

Note: this model requires Ollama 0.5.5 or later.


DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.

This is a modified version of deepseek-v3 Q4_K_M with the context length set to 64K (num_ctx = 64000).
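For reference, a variant like this can be built by layering a small Modelfile over the base quantized model and overriding num_ctx. The sketch below is an assumption about how it could be done, not a record of how this upload was produced: the base tag deepseek-v3:671b and the output name deepseek-v3-64k are illustrative, and only the num_ctx value is taken from the params shown above.

```
# Hypothetical Modelfile: extend the base Q4_K_M model with a 64K context window.
# The base tag is illustrative; substitute the tag you actually pulled.
FROM deepseek-v3:671b
PARAMETER num_ctx 64000
```

Create and run it with `ollama create deepseek-v3-64k -f Modelfile` followed by `ollama run deepseek-v3-64k`. Alternatively, num_ctx can be set per request via the options field of the Ollama API without rebuilding the model.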