An upscaled NeMo with half its layers trained on my special sauce 💦
238 Pulls · Updated 4 months ago
7dfb0eb6d349 · 14GB
model
arch llama · parameters 20.4B · quantization Q5_K_M
14GB
template
[INST] <<SYS>>{{ .System }}<</SYS>>
{{ .Prompt }} [/INST]
58B
params
{
  "stop": [
    "[INST]",
    "[/INST]",
    "<<SYS>>",
    "<</SYS>>"
  ]
}
91B
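The template above is the Mistral [INST] prompt format, and the stop strings in params cut generation off at turn boundaries. A minimal sketch of how both come into play when calling the Ollama REST API — the local model tag and the prompts here are placeholders, not anything defined on this page:

```python
# Minimal sketch (not from the model card): calling the Ollama REST API,
# where the template and stop strings shown above are applied server-side.
# The model tag "theia" is a placeholder for whatever tag you pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "theia",
        "system": "You are a storyteller with rich, vivid prose.",
        "prompt": "Describe the ruined temple at dusk.",
        "stream": False,
        # Overriding options is optional; these stop strings simply
        # mirror the params block above.
        "options": {"stop": ["[INST]", "[/INST]", "<<SYS>>", "<</SYS>>"]},
    },
    timeout=300,
)
print(resp.json()["response"])
```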
Readme
From TheDrummer/Theia-21B-v1.
BeaverAI & Steelskull proudly present…
Theia 21B v1 🧘
An upscaled NeMo with half its layers trained on my special sauce 💦
According to the giant impact hypothesis, Theia orbited the Sun, nearly along the orbit of the proto-Earth, by staying close to one or the other of the Sun-Earth system’s two more stable Lagrangian points.
Theia was eventually perturbed away from that relationship by the gravitational influence of Jupiter, Venus, or both, resulting in a collision between Theia and Earth.
Description
- This is a finetune of Steelskull’s NeMoria 21B model.
- You can check out his official merges here: https://huggingface.co/Steelskull.
- The finetuning process involved completion, RP, and instruct training.
- Testers have reported enhanced RP capabilities, better storytelling, and richer prose.
Links
- Original: https://huggingface.co/TheDrummer/Theia-21B-v1
- GGUF: https://huggingface.co/TheDrummer/Theia-21B-v1-GGUF
Usage
- Mistral format
- Make sure that there is a space between `[/INST]` and `{{char}}:`, i.e. `[/INST] {{char}}:`. Omitting the space (`[/INST]{{char}}:`) has a huge negative impact on performance (see the sketch after this list).
- Use DRY (the Don't Repeat Yourself repetition-penalty sampler).
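The spacing rule above only matters if you assemble the raw prompt yourself instead of letting a front-end apply the template. A small sketch of the correct and incorrect forms — the character name and messages are made up for illustration:

```python
# Sketch of the [/INST] spacing rule; names and text are illustrative only.
system = "You are Seraphina, a wandering bard."
user = "What lies down the eastern road?"
char = "Seraphina"

# Correct: a space between [/INST] and the character name.
good = f"[INST] <<SYS>>{system}<</SYS>>\n{user} [/INST] {char}:"

# Incorrect: no space after [/INST] -- reported to badly degrade output.
bad = f"[INST] <<SYS>>{system}<</SYS>>\n{user} [/INST]{char}:"

print(good)
```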