Fixes memory prediction issues and limits the number of layers loaded onto the GPU.

139 5 months ago

732caedf08d1 · 112B
{{ if .System }}{{ .System }}
{{ end }}{{ if .Prompt }}User: {{ .Prompt }}
{{ end }}Assistant: {{ .Response }}
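
The template above uses Go text/template syntax: the system line and the "User:" line are each emitted only when the corresponding variable is non-empty, and the "Assistant:" prefix is always emitted. A minimal sketch of how it expands, assuming the three variables are plain strings, might look like the following. The names promptVars and renderPrompt are hypothetical and exist only for this illustration; they are not part of any library, and this is not necessarily how the model runner itself evaluates the template.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptTemplate is the template text from the section above, copied verbatim.
const promptTemplate = `{{ if .System }}{{ .System }}
{{ end }}{{ if .Prompt }}User: {{ .Prompt }}
{{ end }}Assistant: {{ .Response }}`

// promptVars is a hypothetical struct holding the three variables the
// template references. Treating them as plain strings is an assumption.
type promptVars struct {
	System   string
	Prompt   string
	Response string
}

// renderPrompt (hypothetical helper) parses the template and executes it
// against the given variables, returning the expanded prompt text.
func renderPrompt(v promptVars) (string, error) {
	tmpl, err := template.New("prompt").Parse(promptTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, v); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Response is left empty, as it would be before the model has generated anything.
	out, err := renderPrompt(promptVars{
		System: "You are a concise assistant.",
		Prompt: "Why is the sky blue?",
	})
	if err != nil {
		panic(err)
	}
	// %q makes the newlines and the trailing space visible.
	fmt.Printf("%q\n", out)
}
```

With an empty System value the first line is skipped entirely, so the rendered text starts directly at "User:".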