A family of efficient AI models with up to 10B parameters that perform well in science, math, and coding, trained with techniques such as depth up-scaling and knowledge distillation.

Sizes: 1b, 3b, 7b, 10b

ollama run falcon3:3b-instruct-q8_0

# Send a chat request to the local Ollama server (the reply streams by default)
curl http://localhost:11434/api/chat \
  -d '{
    "model": "falcon3:3b-instruct-q8_0",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

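from ollama import chat

# Send a single user message and wait for the complete reply
response = chat(
    model='falcon3:3b-instruct-q8_0',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
# Print the text of the assistant's reply
print(response.message.content)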

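import ollama from 'ollama'

// Send a single user message (top-level await requires an ES module)
const response = await ollama.chat({
  model: 'falcon3:3b-instruct-q8_0',
  messages: [{role: 'user', content: 'Hello!'}],
})
// Print the text of the assistant's reply
console.log(response.message.content)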

Details

Updated 1 year ago

05a32e830399 · 3.4GB

model

arch: llama
parameters: 3.23B
quantization: Q8_0
size: 3.4GB

license (13kB)

Falcon 3 TII Falcon License December 2024 FalconLLM.tii.ae Introductory note This license is, in par

template (218B)

{{- range $i, $_ := .Messages }} {{- $last := eq (len (slice $.Messages $i)) 1 -}} <|{{ .Role }}|> {

params (101B)

{ "stop": [ "<|system|>", "<|user|>", "<|end|>", "<|assistant|>"
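
The 3.4GB size listed in the details can be roughly cross-checked from the parameter count and the quantization type. The sketch below assumes the GGUF Q8_0 layout (blocks of 32 int8 weights plus one 16-bit scale, so 34 bytes per 32 weights) and assumes every tensor is stored as Q8_0, which is a simplification. The function name is hypothetical.

```python
# Rough size estimate for a Q8_0-quantized model.
# Assumption: every weight sits in a Q8_0 block of 32 int8 values plus
# one 2-byte scale (34 bytes per block). Real files also carry metadata
# and may keep some tensors in other formats.

Q8_0_BLOCK_WEIGHTS = 32
Q8_0_BLOCK_BYTES = 32 + 2  # 32 int8 weights + one 16-bit scale


def estimate_q8_0_size_gb(n_params: float) -> float:
    """Return the approximate file size in GB (1 GB = 1e9 bytes)."""
    bytes_per_weight = Q8_0_BLOCK_BYTES / Q8_0_BLOCK_WEIGHTS
    return n_params * bytes_per_weight / 1e9


# 3.23B is the parameter count shown in the model details.
size_gb = estimate_q8_0_size_gb(3.23e9)
print(f"{size_gb:.2f} GB")  # prints "3.43 GB"
```

Under these assumptions the estimate lands close to the listed 3.4GB, which suggests the download is dominated by the quantized weights.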

Readme

Falcon3 is TII’s latest family of efficient language models with up to 10B parameters, focused on improving science, math, and code capabilities while keeping training efficient.

Key Features

Four sizes: 1B, 3B, 7B, 10B
The 10B model was created from the 7B model using depth up-scaling
The smaller models (1B, 3B) were trained with knowledge distillation
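
Depth up-scaling grows a model by reusing layers from a smaller one and then continuing training. The readme does not say which layers Falcon3 duplicates, so the sketch below shows only one generic scheme (two overlapping copies of the layer stack) on a toy list; it is not TII's recipe, and `depth_upscale` and `overlap` are hypothetical names.

```python
def depth_upscale(layers: list, overlap: int) -> list:
    """Build a deeper stack from two copies of `layers`.

    The first copy drops its last `overlap` layers, the second copy drops
    its first `overlap` layers, and the two are concatenated.
    """
    n = len(layers)
    if not 0 <= overlap < n:
        raise ValueError("overlap must be in [0, len(layers))")
    return layers[: n - overlap] + layers[overlap:]


# Toy example: integer indices stand in for transformer blocks.
# A real model would copy the layer weights and then keep training.
base = list(range(8))
deeper = depth_upscale(base, overlap=2)
print(len(deeper), deeper)  # prints "12 [0, 1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7]"
```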

Performance Highlights

falcon3:1b outperforms smollm2:1.7b and matches gemma2:2b
falcon3:10b achieves state-of-the-art results in the under-13B category
Context length extended to 32K tokens (8K for the 1B model)
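
To make use of the longer context, a request can raise the context window through the `num_ctx` option of the Ollama chat API. The sketch below picks a window from the limits listed above and builds the request body. The helper names are hypothetical, and reading 32K and 8K as 32768 and 8192 tokens is an assumption.

```python
import json

# Context limits from the list above; the exact token counts are assumed.
MAX_CONTEXT = {"1b": 8192}
DEFAULT_MAX_CONTEXT = 32768  # 3b, 7b, and 10b


def max_context_for(tag: str) -> int:
    """Look up the context limit for a tag such as 'falcon3:3b-instruct-q8_0'."""
    # The size is the first dash-separated piece after the colon.
    size = tag.split(":", 1)[-1].split("-", 1)[0].lower()
    return MAX_CONTEXT.get(size, DEFAULT_MAX_CONTEXT)


def build_chat_request(tag: str, prompt: str) -> dict:
    """Build a /api/chat request body that asks for the full context window."""
    return {
        "model": tag,
        "messages": [{"role": "user", "content": prompt}],
        "options": {"num_ctx": max_context_for(tag)},
        "stream": False,  # return one JSON object instead of a stream
    }


request_body = build_chat_request("falcon3:3b-instruct-q8_0", "Hello!")
print(json.dumps(request_body, indent=2))
```

The printed JSON can be used as the body of the curl request shown earlier. Larger windows use more memory, so a smaller `num_ctx` may be preferable on constrained machines.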
