Hacking away at ML & AI
-
deepseek-v3-64k
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
19 Pulls · 1 Tag · Updated 10 months ago
-
openreasoning-nemotron
OpenReasoning-Nemotron-32B is a large language model (LLM) derived from Qwen2.5-32B. It is a reasoning model post-trained for generating solutions to math, code, and science problems.
32b · 18 Pulls · 1 Tag · Updated 2 weeks ago
-
bonito-v1
An open-source model for conditional task generation.
13 Pulls · 1 Tag · Updated 1 year ago
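
A minimal sketch of how one of the models above could be pulled and queried locally through the official `ollama` Python client; the bare model name `openreasoning-nemotron` is an assumption here, and the actual namespace/tag should be taken from what `ollama list` reports on your machine.

```python
# Sketch: pull and query a listed model via the ollama Python client
# (pip install ollama). The model name below is assumed; substitute the
# exact namespace/tag shown by `ollama list` for your local copy.
import ollama

# Download the model if it is not already available locally.
ollama.pull("openreasoning-nemotron")

# Ask the reasoning model a short question and print its reply.
response = ollama.chat(
    model="openreasoning-nemotron",
    messages=[{"role": "user", "content": "What is the derivative of x^3 + 2x?"}],
)
print(response["message"]["content"])
```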