Ultra-fast edge model for data extraction and tool use

ollama run LiquidAI/lfm2.5-350m:q6_k

Details

Updated 4 weeks ago · bb805c7f66ed · 293MB

  • Architecture: lfm2
  • Parameters: 354M
  • Quantization: Q6_K
  • Template (truncated in page source): <|startoftext|>{{ if .System }}<|im_start|>system {{ .System }}<|im_end|> {{ end }}{{ if .Prompt }}<
  • System prompt: You are a helpful assistant trained by Liquid AI.
  • Default parameters (truncated in page source): { "repeat_penalty": 1.05, "stop": [ "<|im_start|>", "<|im_end|>", "<

Readme

Liquid AI

LFM2.5-350M

LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.

  • Best-in-class performance: A 350M model rivaling much larger models, bringing high-quality AI to your pocket.
  • Fast edge inference: 313 tok/s decode on an AMD CPU and 188 tok/s on Snapdragon Gen4. Runs in under 1GB of memory with day-one support for llama.cpp, MLX, and vLLM.
  • Scaled training: Extended pre-training from 10T to 28T tokens and large-scale multi-stage reinforcement learning.

Find more information about LFM2.5-350M in our blog post.

🗒️ Model Details

LFM2.5-350M is a general-purpose text-only model with the following features:

  • Number of parameters: 350M
  • Number of layers: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
  • Training budget: 28T tokens
  • Context length: 32,768 tokens
  • Vocabulary size: 65,536
  • Knowledge cutoff: Mid-2024
  • Languages: English, Arabic, Chinese, French, German, Japanese, Korean, Portuguese, Spanish
  • Generation parameters:
    • temperature: 0.1
    • top_k: 50
    • repetition_penalty: 1.05
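
The recommended generation parameters above can be baked into a local model variant with an Ollama Modelfile. This is a minimal sketch; the derived model name `lfm2.5-350m-extract` is an arbitrary choice, not part of the model card:

```
# Modelfile: pin the card's recommended sampling parameters
FROM LiquidAI/lfm2.5-350m:q6_k
PARAMETER temperature 0.1
PARAMETER top_k 50
PARAMETER repeat_penalty 1.05
```

Build it with `ollama create lfm2.5-350m-extract -f Modelfile`, then `ollama run lfm2.5-350m-extract` uses these settings by default.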

We recommend using it for data extraction, structured outputs, and tool use. It is not recommended for knowledge-intensive tasks or programming.
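
As a sketch of the data-extraction use case, the snippet below builds a request body for Ollama's `/api/chat` endpoint, using JSON mode and the recommended generation parameters from this card. The extraction fields (`name`, `year`) and the system prompt wording are illustrative assumptions, not part of the model card:

```python
import json

def extraction_request(text: str) -> dict:
    """Build an Ollama /api/chat request that asks the model to
    extract structured fields from free text as JSON."""
    return {
        "model": "LiquidAI/lfm2.5-350m:q6_k",
        "messages": [
            # Hypothetical extraction instruction; adjust fields to your task.
            {"role": "system",
             "content": "Extract the person's name and the year from the text. "
                        "Reply with a JSON object with keys 'name' and 'year'."},
            {"role": "user", "content": text},
        ],
        # Ollama's JSON mode constrains the output to valid JSON.
        "format": "json",
        "stream": False,
        # Recommended generation parameters from the model card.
        "options": {"temperature": 0.1, "top_k": 50, "repeat_penalty": 1.05},
    }

req = extraction_request("Ada Lovelace published her notes in 1843.")
print(json.dumps(req, indent=2))
```

POST this body to `http://localhost:11434/api/chat` on a machine running Ollama; the `message.content` field of the response should contain the extracted JSON object.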

📊 Performance

Benchmarks

Model                     GPQA Diamond   MMLU-Pro   IFEval   IFBench   Multi-IF
LFM2.5-350M                      30.64      20.01    76.96     40.69      44.92
LFM2-350M                        27.58      19.29    64.96     18.20      32.92
Granite 4.0-H-350M               22.32      13.14    61.27     17.22      28.70
Granite 4.0-350M                 25.91      12.84    53.48     15.98      24.21
Qwen3.5-0.8B (Instruct)          27.41      37.42    59.94     22.87      41.68
Qwen3.5-0.8B (Thinking)          19.29        -*     32.93     22.00      26.44
Gemma 3 1B IT                    23.89      14.04    63.49     20.33      44.25

Model                     CaseReportBench   BFCLv3   BFCLv4   τ²-Bench Telecom   τ²-Bench Retail
LFM2.5-350M                         32.45    44.11    21.86              18.86             17.84
LFM2-350M                           11.67    22.95    12.29              10.82              5.56
Granite 4.0-H-350M                  12.44    43.07    13.28              13.74              6.14
Granite 4.0-350M                     0.84    39.58    13.73               2.92              6.14
Qwen3.5-0.8B (Instruct)             13.83    35.08    18.70              12.57              6.14
Qwen3.5-0.8B (Thinking)              0.39    39.64    25.39              14.33              7.02
Gemma 3 1B IT                        2.28    16.61     7.17               9.36              6.43

*Evaluation could not be completed due to doom looping.

CPU Inference / GPU Inference: throughput charts (see the blog post)

📬 Contact