Fixed memory prediction issues. Limits the number of layers loaded onto the GPU.

6f29ca49a8a8 · 45B
{
  "num_gpu": 18,
  "stop": [
    "User:",
    "Assistant:"
  ]
}
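
As a rough illustration of how these parameters could be reused elsewhere, the sketch below converts the JSON above into Modelfile-style `PARAMETER` lines. It assumes the blob is an Ollama-style parameter set, where `num_gpu` is the number of layers sent to the GPU and each `stop` entry becomes its own line. The helper name `to_modelfile_parameters` is hypothetical and not part of any library.

```python
import json

# Parameter blob copied from the listing above.
PARAMS_JSON = """
{
  "num_gpu": 18,
  "stop": ["User:", "Assistant:"]
}
"""


def to_modelfile_parameters(params: dict) -> list[str]:
    """Render a parameter dict as Modelfile-style PARAMETER lines.

    Hypothetical helper: list values (such as "stop") are expanded
    into one PARAMETER line per entry, and strings are quoted.
    """
    lines = []
    for key, value in params.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            # json.dumps quotes strings and leaves numbers bare.
            lines.append(f"PARAMETER {key} {json.dumps(item)}")
    return lines


params = json.loads(PARAMS_JSON)
modelfile_lines = to_modelfile_parameters(params)
print("\n".join(modelfile_lines))
```

Lowering `num_gpu` keeps more layers on the CPU, trading speed for lower GPU memory use. The value 18 is simply the one given in this listing and is not a general recommendation.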