ollama run nishtahir/sera


SERA (Unofficial)

Model Variants

| Model | HuggingFace | Base | Teacher | SWE-bench Verified |
|---|---|---|---|---|
| SERA-32B | allenai/SERA-32B | Qwen 3-32B | GLM-4.6 | 49.5% ± 1.9% |
| SERA-32B-GA | allenai/SERA-32B-GA | Qwen 3-32B | GLM-4.5-Air | 46.6% ± 0.7% |
| SERA-8B | allenai/SERA-8B | Qwen 3-8B | GLM-4.6 | 31.7% ± 0.9% |
| SERA-8B-GA | allenai/SERA-8B-GA | Qwen 3-8B | GLM-4.5-Air | 31.7% ± 0.4% |

All results evaluated at 32K context length. Standard deviations computed over 3 random seeds.
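The ± figures are sample standard deviations over three evaluation seeds. A minimal sketch of that computation in Python; the per-seed values below are illustrative stand-ins, not the actual run results (only the reported 49.5% mean comes from the table above):

```python
import statistics

# Illustrative per-seed SWE-bench Verified resolve rates (percent).
# These three values are invented for the sketch; the real per-seed
# numbers behind the 49.5% ± 1.9% entry are not published here.
seed_results = [47.8, 49.5, 51.2]

mean = statistics.mean(seed_results)
stdev = statistics.stdev(seed_results)  # sample std dev (n - 1) over the 3 seeds

print(f"{mean:.1f}% ± {stdev:.1f}%")  # → 49.5% ± 1.7%
```

Using the sample (rather than population) standard deviation is the usual choice when only a handful of seeds are run.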

Performance

SWE-bench Verified (32K Context)

| Model | Type | Resolve Rate |
|---|---|---|
| SkyRL-8B | Open-source | 9.4% |
| Nex-N1-8B | Open-source | 20.3% |
| SERA-8B | Open-source | 31.7% |
| Qwen 3-32B (base) | Open-weight | 24.4% |
| SWE-smith | Open-source | 32.6% |
| SkyRL-Agent | Open-source | 39.4% |
| DeepSWE | Open-source | 42.2% |
| SERA-32B | Open-source | 49.5% |
| Devstral-Small-2 (24B) | Open-weight | 50.0% |
| GLM-4.5-Air (110B) | Open-weight | 50.5% |

Open-source: code, model weights, and data publicly available. Open-weight: model weights available but training data/code not fully released.