akx/viking-7b

Viking 7B is a model pretrained on Finnish, English, Swedish, Danish, Norwegian, Icelandic and code.

Viking 7B is a 7B parameter decoder-only transformer pretrained on Finnish, English, Swedish, Danish, Norwegian, Icelandic and code. It has been trained on 2 trillion tokens. Viking 7B is a fully open source model and is made available under the Apache 2.0 License.

Viking was created in a collaboration between the TurkuNLP group of the University of Turku, SiloGen from Silo AI, and High Performance Language Technologies (HPLT). Training was conducted on the LUMI supercomputer, using compute resources generously provided by CSC - IT Center for Science, Finland.

This project is part of an ongoing effort to create open source large language models for non-English and especially low-resource languages like Finnish. The model is fluent in Finnish, English, and the Scandinavian languages, and is capable of basic translation between them. It can also understand and generate code.

NOTE: Viking is a base model which needs further fine-tuning for most use cases.
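
Because a base model continues text rather than following instructions, one common way to try it without fine-tuning is a few-shot prompt that ends where the model should pick up. The sketch below only builds such a prompt for translation; the function name `build_few_shot_prompt`, the label format, and the example sentences are illustrative assumptions and do not come from the Viking documentation.

```python
def build_few_shot_prompt(examples, query, src="English", tgt="Finnish"):
    """Format (source, target) pairs followed by an unfinished final item.

    The helper name and the "Language: text" layout are hypothetical;
    any consistent pattern the model can continue would serve.
    """
    blocks = [f"{src}: {s}\n{tgt}: {t}" for s, t in examples]
    # Leave the last target empty so a base model continues from this point.
    blocks.append(f"{src}: {query}\n{tgt}:")
    return "\n\n".join(blocks)


# Illustrative example pairs (assumed, not taken from the model card).
examples = [
    ("Good morning.", "Hyvää huomenta."),
    ("Thank you.", "Kiitos."),
]
prompt = build_few_shot_prompt(examples, "Where is the library?")
print(prompt)
```

The resulting string can be passed as the prompt to whichever runtime serves the model; stopping generation at the first blank line keeps the model from inventing further example pairs.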

This GGML quantization was produced with akx/ggify and llama.cpp b2901, using a small modification to the conversion script to support the Viking tokenizer.

