141 3 months ago

A 4B-parameter, 8-bit quantized inference model with /think (reasoning) and /no_think (fast response) modes.

tools thinking