Details

Updated 1 year ago

a1245824de01 · 2.2GB

model

arch openelm

parameters 3.04B

quantization Q5_K_M

2.2GB
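
As a rough sanity check, the parameter count and file size listed above imply an average storage cost per weight. The sketch below assumes that "2.2GB" means 2.2 × 10^9 bytes and that the whole file is weight data (it also holds metadata and some tensors at other precisions), so the result is only an approximation. The helper name `bits_per_weight` is hypothetical.

```python
def bits_per_weight(file_bytes: float, n_params: float) -> float:
    """Average number of bits stored per parameter for a model file."""
    return file_bytes * 8 / n_params


# Values taken from the listing: 2.2GB file, 3.04B parameters.
# Assumption: 1 GB = 1e9 bytes, and the file is dominated by weights.
bpw = bits_per_weight(2.2e9, 3.04e9)
print(f"{bpw:.2f} bits per weight")  # prints "5.79 bits per weight"
```

A figure slightly above 5 bits per weight is what one would expect from a 5-bit mixed quantization such as Q5_K_M.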

system

You are a helpful assistant. Perform the task to the best of your ability.

74B

license

2.7kB

params

{ "num_ctx": 2048, "stop": [ "<|system|>", "<|user|>", "<|assistant|

147B

template

{{- if .System }} <|system|> {{ .System }} </s> {{- end }} <|user|> {{ .Prompt }} </s> <|assistant|>

100B
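
To illustrate what the template above produces, here is a minimal Python sketch of the same logic: an optional system block, then the user prompt, then an open assistant tag. The function name `render_prompt` is hypothetical, and the exact whitespace is an assumption, since the template is shown here on a single line and its original line breaks are not recoverable.

```python
def render_prompt(prompt: str, system: str = "") -> str:
    """Approximate the chat template: optional system turn, user turn, assistant tag.

    Assumption: each role tag sits on its own line and each turn ends with </s>.
    The real template's whitespace may differ.
    """
    parts = []
    if system:  # mirrors `{{- if .System }} ... {{- end }}`
        parts.append(f"<|system|>\n{system}</s>")
    parts.append(f"<|user|>\n{prompt}</s>")
    parts.append("<|assistant|>")  # left open so the model writes the reply
    return "\n".join(parts)


text = render_prompt(
    "Summarize OpenELM in one sentence.",
    system="You are a helpful assistant. Perform the task to the best of your ability.",
)
print(text)
```

The role tags also appear in the `stop` list in params, which keeps the model from generating past the end of its own turn.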

OpenELM: An Efficient Language Model Family with Open Training and Inference Framework

Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal, Mohammad Rastegari

We introduce OpenELM, a family of Open Efficient Language Models. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. We pretrained OpenELM models using the CoreNet library. We release both pretrained and instruction tuned models with 270M, 450M, 1.1B and 3B parameters. We release the complete framework, encompassing data preparation, training, fine-tuning, and evaluation procedures, alongside multiple pre-trained checkpoints and training logs, to facilitate open research.

Our pre-training dataset contains RefinedWeb, deduplicated PILE, a subset of RedPajama, and a subset of Dolma v1.6, totaling approximately 1.8 trillion tokens. Please check license agreements and terms of these datasets before using them.

(from: https://huggingface.co/apple/OpenELM-3B-Instruct)

Model conversion by https://huggingface.co/SanctumAI/OpenELM-3B-Instruct-GGUF
