Our smallest model is also a Mixture of Experts (MoE). With 0.4B active parameters out of 1.3B total, it outperforms considerably larger models in both efficiency and quality.
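The active-vs-total parameter split comes from MoE routing: for each token, a small router picks a subset of expert networks, and only those experts' weights run. The sketch below is a toy illustration of that idea (the expert count, dimensions, and top-k value are made up for demonstration and are not this model's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, purely illustrative: 4 experts, 1 active per token.
d, n_experts, top_k = 8, 4, 1

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_experts)]
router = rng.standard_normal((d, n_experts)) / np.sqrt(d)

def moe_forward(x):
    """Route each token to its top-k experts; only those weights are used."""
    logits = x @ router                                  # (tokens, n_experts)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)                # softmax over experts
    chosen = np.argsort(-probs, axis=-1)[:, :top_k]      # top-k expert indices
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for e in chosen[t]:
            # Only the selected experts' parameters participate for this token.
            out[t] += probs[t, e] * (x[t] @ experts[e])
    return out, chosen

tokens = rng.standard_normal((3, d))
y, chosen = moe_forward(tokens)
```

With 1 of 4 experts active per token here, only a quarter of the expert parameters run per forward pass, which is the same mechanism that lets a 1.3B-total model compute with only 0.4B active parameters.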

{
"temperature": 0.7
}