


From TheDrummer/Theia-21B-v1.

BeaverAI & Steelskull proudly present…

Theia 21B v1 🧘

An upscaled NeMo with half its layers trained on my special sauce 💦


According to the giant impact hypothesis, Theia orbited the Sun, nearly along the orbit of the proto-Earth, by staying close to one or the other of the Sun-Earth system’s two more stable Lagrangian points.


Theia was eventually perturbed away from that relationship by the gravitational influence of Jupiter, Venus, or both, resulting in a collision between Theia and Earth.




  • Use the Mistral format
    • Put a space between [/INST] and {{char}}:, as in [/INST] {{char}}:.
    • Omitting the space, as in [/INST]{{char}}:, has a huge negative impact on performance.
  • Use the DRY (Don't Repeat Yourself) sampler
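
The spacing rule above is easy to get wrong when prompts are assembled by hand, so here is a minimal Python sketch of it. The helper names `build_prompt` and `fix_missing_space` are hypothetical and not part of any frontend or library. The whitespace inside the [INST] block is also an assumption; only the space after [/INST] comes from the notes above.

```python
import re

INST_CLOSE = "[/INST]"


def build_prompt(user_message: str, char_name: str) -> str:
    """Hypothetical helper: format one user turn, Mistral style.

    The prompt ends with the character tag, separated from [/INST]
    by exactly one space, so the model continues as that character.
    """
    return f"[INST] {user_message.strip()} {INST_CLOSE} {char_name}:"


def fix_missing_space(prompt: str) -> str:
    """Insert a space wherever [/INST] is directly followed by a
    non-whitespace character. Correctly spaced prompts are unchanged."""
    return re.sub(r"\[/INST\](?=\S)", "[/INST] ", prompt)


if __name__ == "__main__":
    print(build_prompt("Tell me about the Moon.", "Theia"))
    print(fix_missing_space("[INST] Hi [/INST]Theia:"))
```

In a frontend that fills in {{char}} from a template, the same check applies to the template string itself: the character tag should follow [/INST] after a single space.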