713 Downloads Updated 1 year ago
6 models:
fietje-2b-chat:Q3_K_M · 1.4GB · 2K context window · Text · 1 year ago
fietje-2b-chat:Q4_K_M · 1.7GB · 2K context window · Text · 1 year ago
fietje-2b-chat:Q5_K_M · 2.0GB · 2K context window · Text · 1 year ago
fietje-2b-chat:Q6_K · 2.3GB · 2K context window · Text · 1 year ago
fietje-2b-chat:Q8_0 · 3.0GB · 2K context window · Text · 1 year ago
fietje-2b-chat:f16 · 5.6GB · 2K context window · Text · 1 year ago
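The tag after the colon selects which quantization Ollama downloads and runs. A minimal usage sketch, assuming Ollama is installed and the model is published under the repository name shown above (the exact registry path may include a user namespace):

```shell
# Pull and run the Q4_K_M variant; the tag picks the quant level.
ollama pull fietje-2b-chat:Q4_K_M
ollama run fietje-2b-chat:Q4_K_M "Wat is de hoofdstad van Friesland?"
```

Smaller quants (Q3_K_M) trade answer quality for memory; larger ones (Q6_K, Q8_0) are closer to f16 quality at the cost of disk and RAM.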
This repository contains quantized versions of BramVanroy/fietje-2b-chat.
Available quantization types and their expected quality loss relative to the f16 base, measured as added perplexity (higher perplexity = worse). These are llama.cpp's reference measurements on LLaMA-v1-7B, so the file sizes below describe that 7B reference model, not the smaller files of this 2B model listed above:
Q3_K_M : 3.07G, +0.2496 ppl @ LLaMA-v1-7B
Q4_K_M : 3.80G, +0.0532 ppl @ LLaMA-v1-7B
Q5_K_M : 4.45G, +0.0122 ppl @ LLaMA-v1-7B
Q6_K : 5.15G, +0.0008 ppl @ LLaMA-v1-7B
Q8_0 : 6.70G, +0.0004 ppl @ LLaMA-v1-7B
F16 : 13.00G @ 7B
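The effective bits per weight of each variant can be estimated from the file sizes listed at the top of this page. A small sketch, assuming the f16 file (5.6GB at 16 bits/weight) implies roughly 2.8B parameters; the listed sizes are rounded, so these figures are approximate:

```python
# File sizes (GB) as listed for this model's tags.
sizes_gb = {
    "Q3_K_M": 1.4,
    "Q4_K_M": 1.7,
    "Q5_K_M": 2.0,
    "Q6_K": 2.3,
    "Q8_0": 3.0,
    "f16": 5.6,
}

# Parameter count (in billions) implied by the f16 file at 16 bits/weight.
params_b = sizes_gb["f16"] * 8 / 16  # ~2.8B

for name, gb in sizes_gb.items():
    bpw = gb * 8 / params_b  # approximate bits per weight
    print(f"{name}: ~{bpw:.1f} bits/weight")
```

This shows why Q6_K and Q8_0 sit so close to f16 in perplexity: at roughly 6.6 and 8.6 bits per weight they retain most of the precision, while Q3_K_M compresses to about 4 bits per weight.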
Quants were made with release b2777 of llama.cpp.
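For reference, quants like these are typically produced with llama.cpp's conversion and quantization tools. A sketch under stated assumptions: the paths and output filenames are illustrative, the commands are run from a llama.cpp checkout built at the release noted above (where the quantization binary was named `quantize`), and `./fietje-2b-chat/` is a local copy of the BramVanroy/fietje-2b-chat checkpoint:

```shell
# Convert the Hugging Face checkpoint to GGUF at f16, then quantize it.
python convert-hf-to-gguf.py ./fietje-2b-chat --outtype f16 \
    --outfile fietje-2b-chat-f16.gguf

# Produce one of the quantized variants (repeat per quant type).
./quantize fietje-2b-chat-f16.gguf fietje-2b-chat-Q4_K_M.gguf Q4_K_M
```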