A family of efficient AI models under 10B parameters with strong performance in science, math, and coding, achieved through innovative training techniques.

Falcon3 represents TII’s latest advancement in efficient language models under 10B parameters, focused on enhancing science, math, and code capabilities while maintaining training efficiency.

Key Features

  • Four sizes: 1B, 3B, 7B, and 10B
  • The 10B model was created from the 7B model via depth up-scaling (see the sketch after this list)
  • The 1B and 3B models were trained with knowledge distillation
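
Depth up-scaling here refers to growing a trained network by duplicating a block of its transformer layers and then continuing training. The snippet below is a minimal, hypothetical sketch of the layer-duplication step only; the function and the indices in the toy example are illustrative, not TII's actual training code.

```python
import copy

import torch.nn as nn


def depth_upscale(layers: nn.ModuleList, start: int, end: int) -> nn.ModuleList:
    """Splice a copy of layers[start:end] back in after layer end - 1.

    A stack of L layers grows to L + (end - start) layers; training then
    continues so the duplicated layers can specialize.
    """
    duplicated = [copy.deepcopy(layer) for layer in layers[start:end]]
    return nn.ModuleList(list(layers[:end]) + duplicated + list(layers[end:]))


# Toy example: grow a 4-layer stack to 6 layers by duplicating layers 1-2.
stack = nn.ModuleList(nn.Linear(8, 8) for _ in range(4))
print(len(depth_upscale(stack, 1, 3)))  # 6
```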

Performance Highlights

  • falcon3:1b outperforms smollm2:1.7b and matches gemma2:2b
  • falcon3:10b achieves state-of-the-art results among models under 13B parameters
  • Context length extended to 32K tokens (8K for the 1B model), as in the usage sketch below
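
All four variants run through Ollama as usual. A minimal usage sketch with the official `ollama` Python client follows; the model tag, prompt, and context setting are just examples.

```python
# Minimal sketch; assumes `pip install ollama` and a local Ollama server
# with the model already pulled (e.g. `ollama pull falcon3:10b`).
import ollama

response = ollama.chat(
    model="falcon3:10b",
    messages=[{"role": "user", "content": "Differentiate x^3 + 2x."}],
    # Request the extended 32K-token context window (the 1B model tops out at 8K).
    options={"num_ctx": 32768},
)
print(response["message"]["content"])
```

The same models can also be run interactively from the command line, e.g. `ollama run falcon3:1b`.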

References

Hugging Face