A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
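That one line is the page's whole technical claim: the model holds 671B parameters in total, but a router selects only a small subset of experts (37B parameters' worth) for any single token's forward pass. A minimal sketch of generic top-k MoE routing, with made-up sizes and NumPy, illustrates the idea; it is not this model's actual architecture or routing code:

import numpy as np

rng = np.random.default_rng(0)

n_experts = 8   # total experts in the layer (illustrative value)
top_k = 2       # experts activated per token (illustrative value)
d_model = 16    # hidden size (illustrative value)

# All expert weights exist in memory, but only top_k run per token.
router_w = rng.standard_normal((d_model, n_experts))
expert_w = rng.standard_normal((n_experts, d_model, d_model))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through its top_k experts."""
    logits = x @ router_w                  # score every expert
    topk = np.argsort(logits)[-top_k:]     # keep the k highest-scoring experts
    gates = np.exp(logits[topk])
    gates /= gates.sum()                   # normalize gate weights over the chosen experts
    # Only top_k of the n_experts weight matrices touch this token:
    return sum(g * (x @ expert_w[i]) for g, i in zip(gates, topk))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)              # (16,)
print(f"fraction of expert parameters active per token: {top_k / n_experts:.2f}")

With top_k = 2 of 8 experts, only a quarter of the expert parameters participate per token; the 671B/37B ratio on this page reflects the same sparsity principle at a much larger scale.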
671b • 449 Pulls • Updated 2 weeks ago
5 Tags
360f2525dad4 • 404GB • 2 weeks ago
360f2525dad4 • 404GB • 2 weeks ago
49c763520a73 • 244GB • 2 weeks ago
417546ffc361 • 319GB • 2 weeks ago
360f2525dad4 • 404GB • 2 weeks ago
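Any of these tags can be pulled and run with the standard Ollama CLI. A hedged example, using <model-name> as a placeholder since the repository name is not visible in this listing (the 671b tag from the header is real):

ollama run <model-name>:671b

Running a tag downloads the digest it points to, so expect a transfer in the 244GB to 404GB range depending on which tag is chosen, plus enough memory to serve it.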