111 1 year ago

medical assistant, responds to medical inquiries

0e1793d9d93b · 5.1GB

llama · 7.24B · Q5_K_M
{ "stop": [ "[INST]", "[/INST]" ] }
You are a helpful assistant.
[INST] {{ .System }} {{ .Prompt }} [/INST]
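
As a rough illustration of how the template and stop sequences above fit together, here is a minimal Python sketch. It assumes plain string substitution; Ollama itself evaluates the template with Go's text/template, so this only approximates the result. The helper names `render_prompt` and `truncate_at_stop` are hypothetical and not part of any Ollama API.

```python
# Values copied from the model's parameters, system prompt, and template above.
STOP_SEQUENCES = ["[INST]", "[/INST]"]
DEFAULT_SYSTEM = "You are a helpful assistant."


def render_prompt(prompt: str, system: str = DEFAULT_SYSTEM) -> str:
    """Approximate the template: [INST] {{ .System }} {{ .Prompt }} [/INST]."""
    return f"[INST] {system} {prompt} [/INST]"


def truncate_at_stop(text: str, stops=STOP_SEQUENCES) -> str:
    """Cut generated text at the earliest stop sequence, if one appears."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


if __name__ == "__main__":
    print(render_prompt("What are common symptoms of dehydration?"))
    print(truncate_at_stop("Drink fluids and rest.[INST] next turn"))
```

The stop sequences keep the model from continuing into a new `[INST]` turn on its own: generation ends as soon as either marker would be emitted.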

Readme

The model takes up 5764 MiB of GPU memory.

GPU: NVIDIA A10G

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    On  | 00000000:00:1E.0 Off |                    0 |
|  0%   28C    P0              57W / 300W |   5772MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    136397      C   ...unners/cuda_v11/ollama_llama_server     5764MiB |
+---------------------------------------------------------------------------------------+
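
To put the figures from the output above in context, the following Python sketch converts the reported MiB values to decimal megabytes and computes the share of the card's memory held by the model process. The numbers are copied from the `nvidia-smi` output; the function names are hypothetical.

```python
BYTES_PER_MIB = 1024 ** 2  # nvidia-smi reports memory in MiB (binary units)

# Figures taken from the nvidia-smi output above.
PROCESS_MIB = 5764   # ollama_llama_server process
TOTAL_MIB = 23028    # total memory on the A10G


def mib_to_mb(mib: float) -> float:
    """Convert mebibytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return mib * BYTES_PER_MIB / 1_000_000


def memory_fraction(used_mib: float, total_mib: float) -> float:
    """Return the fraction of total GPU memory in use."""
    return used_mib / total_mib


if __name__ == "__main__":
    print(f"process memory: {mib_to_mb(PROCESS_MIB):.0f} MB")
    print(f"share of GPU memory: {memory_fraction(PROCESS_MIB, TOTAL_MIB):.1%}")
```

By this arithmetic the process holds roughly a quarter of the A10G's memory.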