
Small models made to run on mobile devices, developed by Liquid AI

Available sizes: 350m, 700m, 1.2b, 2.6b, 8b



Notes

Uploaded all quantizations except Q8_0 as Unsloth DynamicQuant2 (DQ2) versions. These are Liquid AI’s on-device-first models, built on a hybrid Liquid architecture (gated short convolutions + grouped-query attention) that Liquid reports is up to 2x faster on CPU than comparable transformers.
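
Below is a minimal PyTorch sketch of the gated short-convolution idea behind those hybrid layers. The projection names, dimensions, and exact gating arrangement are illustrative assumptions, not Liquid AI’s published operator.

```python
# Sketch of a gated short-convolution block in the spirit of LFM2's hybrid
# layers. Names and the gating arrangement are illustrative assumptions.
import torch
import torch.nn as nn

class GatedShortConv(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)  # produces value and gate
        # Depthwise causal conv with a short kernel: cheap on CPU, local mixing.
        self.conv = nn.Conv1d(dim, dim, kernel_size,
                              groups=dim, padding=kernel_size - 1)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        v, g = self.in_proj(x).chunk(2, dim=-1)
        v = v.transpose(1, 2)                # (batch, dim, seq_len)
        v = self.conv(v)[..., : x.shape[1]]  # trim padding -> causal
        v = v.transpose(1, 2)
        return self.out_proj(v * torch.sigmoid(g))  # multiplicative gate

x = torch.randn(1, 16, 64)
print(GatedShortConv(64)(x).shape)  # torch.Size([1, 16, 64])
```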

Temperature: a value in the 0.4–0.6 range works well for instruction-following and reasoning tasks.
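
A minimal sketch of applying that setting via the `ollama` Python client, assuming the package is installed, a local Ollama server is running, and the `lfm2:1.2b` tag from the list above has been pulled:

```python
# Set the recommended temperature through the options dict of ollama.chat.
import ollama

response = ollama.chat(
    model="lfm2:1.2b",
    messages=[{"role": "user", "content": "Summarize GQA in one sentence."}],
    options={"temperature": 0.5},  # middle of the 0.4-0.6 range noted above
)
print(response["message"]["content"])
```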


Description

Liquid Foundation Models 2 (LFM2) is a family of fast, efficient models optimized for edge devices (phones, laptops, vehicles). The hybrid Liquid architecture gives them better speed and memory efficiency than comparably sized transformers.

The key differences between these models are architecture and capacity:

Dense Models:

  • lfm2:350m: Entry-level dense model for ultra-low-resource environments.

  • lfm2:700m: Compact dense model, outperforms Gemma 3 1B IT.

  • lfm2:1.2b: Dense model competitive with Qwen3-1.7B, a model 47% larger.

  • lfm2:2.6b: Highest-quality dense model in the family, outperforms models in the 3B+ class.

Mixture-of-Experts (On-device MoE):

  • lfm2:8b-a1b: 8.3B total parameters with only 1.5B active per token: sparse activation over 32 experts (top-4 per token) delivers near-8B quality at roughly 1.5B-class speed (see the routing sketch below).
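
A toy sketch of top-k expert routing to illustrate the sparse-activation idea (32 experts, top-4 per token). This is purely illustrative, not Liquid AI’s router; the softmax renormalization over the chosen experts is an assumption.

```python
# Toy top-k routing: only k of num_experts experts run for each token.
import torch

def route_topk(scores: torch.Tensor, k: int = 4):
    """scores: (tokens, num_experts) router logits -> (weights, expert ids)."""
    topk_scores, topk_ids = scores.topk(k, dim=-1)
    weights = torch.softmax(topk_scores, dim=-1)  # renormalize over chosen experts
    return weights, topk_ids

logits = torch.randn(3, 32)      # 3 tokens, 32 experts
weights, ids = route_topk(logits)
print(ids)                       # 4 expert indices per token
print(weights.sum(dim=-1))       # each row sums to 1
# Only 4 of 32 experts run per token, so per-token compute scales with the
# active parameters (~1.5B here), not the 8.3B total.
```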

Ideal for:

  • Fast on-device inference (CPU/NPU optimized)

  • Mobile AI applications and robotics

  • Low-latency chat and reasoning

  • Tasks focused on English and Japanese, with strong multilingual support


References

LFM2 Official Announcement

LFM2-2.6B Details

LFM2-8B-A1B MoE Details

HuggingFace LFM2 Collection