234 Downloads · Updated 7 months ago
382840ebf944 · 5.4GB
Configured with a longer sequence length. If you run out of VRAM, I recommend enabling flash attention and KV-cache quantization.
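
As a rough sketch of how those two options might be turned on, assuming the model is served with Ollama (the page does not say so explicitly): Ollama reads `OLLAMA_FLASH_ATTENTION` and `OLLAMA_KV_CACHE_TYPE` from the server's environment. The helper names `ollama_env` and `start_server` are made up for illustration, and `q8_0` is only one possible cache type, not a setting the author specified.

```python
import os
import subprocess

# Cache types accepted by Ollama's OLLAMA_KV_CACHE_TYPE setting.
KV_CACHE_TYPES = ("f16", "q8_0", "q4_0")


def ollama_env(kv_cache_type="q8_0", base=None):
    """Return a copy of the environment with flash attention and
    KV-cache quantization enabled (hypothetical helper)."""
    if kv_cache_type not in KV_CACHE_TYPES:
        raise ValueError(f"unsupported KV cache type: {kv_cache_type}")
    env = dict(os.environ if base is None else base)
    env["OLLAMA_FLASH_ATTENTION"] = "1"          # enable flash attention
    env["OLLAMA_KV_CACHE_TYPE"] = kv_cache_type  # quantize the KV cache
    return env


def start_server(kv_cache_type="q8_0"):
    """Launch `ollama serve` with the settings above (assumes the
    `ollama` binary is on PATH)."""
    return subprocess.Popen(["ollama", "serve"], env=ollama_env(kv_cache_type))


if __name__ == "__main__":
    server = start_server()
    server.wait()
```

Both variables apply to the server process, not to an individual request, so they need to be set before the server starts. Lower-precision cache types such as `q4_0` save more memory but may cost some output quality.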