Fixes memory prediction issues by limiting the number of layers loaded onto the GPU.

16B

46 Pulls · Updated 3 months ago

6f29ca49a8a8 · 45B
{ "num_gpu": 18, "stop": [ "User:", "Assistant:" ] }
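
The parameters above can also be passed per request instead of being baked into the model. The following is a minimal sketch, assuming a local Ollama server at its default address (http://localhost:11434) and its /api/generate endpoint; the model name "my-model:16b" and the helper name build_generate_request are hypothetical placeholders, not taken from this page. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Parameters copied from the params file shown above.
MODEL_PARAMS = {"num_gpu": 18, "stop": ["User:", "Assistant:"]}


def build_generate_request(model, prompt, host="http://localhost:11434"):
    """Build (but do not send) a POST request for Ollama's /api/generate.

    The host default is an assumption: Ollama's standard local address.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_gpu caps the number of layers placed on the GPU;
        # stop lists the strings that end generation.
        "options": MODEL_PARAMS,
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "my-model:16b" is a hypothetical model name used for illustration.
    req = build_generate_request("my-model:16b", "Hello")
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # To send it, call urllib.request.urlopen(req) with a server running.
```

If the model still runs out of GPU memory, lowering num_gpu below 18 keeps more layers on the CPU, trading speed for a smaller GPU footprint.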