Kimi-K2 is a large language model built on a Mixture-of-Experts (MoE) architecture: sparse activation means that only a subset of the total parameters is used for each input token. Total parameter count: ~1 trillion parameters, of which roughly 32 billion are activated per token.
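
To make the sparse-activation idea concrete, here is a minimal sketch of an MoE layer with top-k routing in PyTorch. It is illustrative only: the dimensions, expert count, and top-k value are placeholder assumptions, not Kimi-K2's actual configuration, and real systems add load-balancing losses and fused expert kernels on top of this.

```python
# Minimal sketch of a Mixture-of-Experts layer with top-k sparse routing.
# All hyperparameters below are illustrative, not Kimi-K2's actual config.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        logits = self.router(x)                         # (n_tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep top-k experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over selected experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                           # (token, slot) pairs routed to expert e
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue                                # expert e got no tokens: zero compute spent
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

tokens = torch.randn(16, 512)
layer = MoELayer()
print(layer(tokens).shape)  # torch.Size([16, 512]); only 2 of 8 experts ran per token
```

The key property shown here is that every token flows through only `top_k` experts, so per-token compute stays roughly constant even as `n_experts`, and hence the total parameter count, grows. That is how a ~1-trillion-parameter model can keep inference cost close to that of a much smaller dense model.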