Fixed memory prediction issues. Limit the number of layers loaded by GPU.
149 Pulls Updated 7 months ago
Updated 7 months ago
7 months ago
5a5b9ec355a6 · 8.9GB
model
archdeepseek2
·
parameters15.7B
·
quantizationQ4_0
8.9GB
params
{
"num_gpu": 18,
"stop": [
"User:",
"Assistant:"
]
}
45B
template
{{ if .System }}{{ .System }}
{{ end }}{{ if .Prompt }}User: {{ .Prompt }}
{{ end }}Assistant: {{
112B
license
DEEPSEEK LICENSE AGREEMENT
Version 1.0, 23 October 2023
Copyright (c) 2023 DeepSeek
Section I: PR
14kB
Readme
No readme