71 4 months ago

Our first Mixture of Experts model, with 800 million active parameters, reaches state-of-the-art performance in the sub-1B-parameter range.

tools
1362c6e1ec84 · 20B
{
"temperature": 0.7
}
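
As a rough illustration of what a `temperature` of 0.7 usually means at sampling time, here is a minimal sketch of temperature-scaled softmax. It assumes the common convention of dividing logits by the temperature before normalizing; the source does not state how this model's runtime applies the value. The function name `softmax_with_temperature` and the example logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=0.7):
    """Convert raw logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens (illustrative values only).
example_logits = [2.0, 1.0, 0.0]
default_probs = softmax_with_temperature(example_logits, temperature=1.0)
sharper_probs = softmax_with_temperature(example_logits, temperature=0.7)
```

Under this convention, a temperature below 1.0 concentrates probability on the highest-scoring token, so `sharper_probs[0]` is larger than `default_probs[0]`. Output becomes somewhat more deterministic without being fully greedy.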