169pi/alpie-core:research

Alpie-Core brings together high reasoning accuracy, sustainable compute efficiency, and open accessibility, redefining what’s possible with 4-bit, high-performance AI.

Capabilities: thinking

c7996d521ec5 · 20GB

qwen2 · 32.8B · Q4_K_M
qwen2 · 134M · F32

Readme

Welcome to 169Pi’s Alpie-Core

Alpie-Core is one of the first 4-bit quantized reasoning models: a 32B-parameter system developed by the 169Pi team that matches or outperforms several full-precision frontier models. Built from the DeepSeek-R1-Distill-Qwen-32B backbone, it represents a major leap in efficient reasoning, sustainable AI, and democratized intelligence, all trained on just 8 NVIDIA Hopper GPUs.

Alpie-Core redefines what’s possible under limited resources by combining LoRA/QLoRA, groupwise-blockwise quantization, and synthetic data distillation, achieving state-of-the-art results on reasoning, coding, and math benchmarks — all while reducing memory footprint by over 75%. Designed for researchers, developers, and enterprises, Alpie-Core brings frontier-level reasoning to accessible, low-compute environments.

Get started

You can get started by downloading or running Alpie-Core with Ollama:

To pull the model:

ollama pull 169pi/alpie-core

To run it instantly:

ollama run 169pi/alpie-core

Alpie-Core can also be integrated programmatically for local or API-based workflows.
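
For a local workflow, a minimal sketch against Ollama's REST API is shown below. It assumes the Ollama server is running on its default port, localhost:11434, and that the Python requests package is installed; the prompt is illustrative:

  import requests

  # Send a single (non-streaming) generation request to the local Ollama server.
  resp = requests.post(
      "http://localhost:11434/api/generate",
      json={
          "model": "169pi/alpie-core",
          "prompt": "Explain, step by step, why 4-bit weights cut memory use.",
          "stream": False,  # return one JSON object instead of a token stream
      },
      timeout=300,
  )
  resp.raise_for_status()
  print(resp.json()["response"])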

Benchmarks

Alpie-Core is built for structured reasoning, step-by-step logic, and factual responses. It achieves:

  • MMLU: 81.28%
  • GSM8K: 92.75%
  • BBH: 85.12%
  • SWE-Bench Verified: 57.8%
  • SciQ: 98.0%
  • HumanEval: 57.23%

[Figure: SWE-Bench Verified accuracy comparison]

Feature Highlights

1. Technical Advancements

  • 4-Bit Quantization (NF4): Achieves a memory footprint of roughly 8GB with minimal accuracy loss
  • 128K Context Length: Supports extended, multi-step reasoning over long inputs
  • Fine-Tunable: Fully customise the model to your specific use case through parameter-efficient fine-tuning (see the QLoRA sketch after this list)
  • LoRA + QLoRA Fine-Tuning: Retains reasoning fidelity under low-bit constraints
  • Groupwise + Blockwise Quantization: Reduces noise, enhances precision at scale
  • vLLM-based Inference: Enables low-latency, high-throughput deployment
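
To make the fine-tuning path concrete, here is a minimal QLoRA sketch using the Hugging Face transformers and peft libraries. The repo id 169Pi/Alpie-Core, the LoRA rank, and the target module names are illustrative assumptions, not settings confirmed by the 169Pi team:

  import torch
  from transformers import AutoModelForCausalLM, BitsAndBytesConfig
  from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

  # Load the base model with 4-bit NF4 storage and FP16 compute,
  # matching the scheme described in the Quantization section below.
  bnb_config = BitsAndBytesConfig(
      load_in_4bit=True,
      bnb_4bit_quant_type="nf4",
      bnb_4bit_use_double_quant=True,
      bnb_4bit_compute_dtype=torch.float16,
  )
  model = AutoModelForCausalLM.from_pretrained(
      "169Pi/Alpie-Core",  # assumed Hugging Face repo id
      quantization_config=bnb_config,
      device_map="auto",
  )
  model = prepare_model_for_kbit_training(model)

  # Attach low-rank adapters; only these small matrices receive gradients,
  # so the frozen 4-bit backbone stays untouched.
  lora_config = LoraConfig(
      r=16,
      lora_alpha=32,
      lora_dropout=0.05,
      target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
      task_type="CAUSAL_LM",
  )
  model = get_peft_model(model, lora_config)
  model.print_trainable_parameters()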

2. API & Integration Ready

  • OpenAI-Compatible API: Drop-in replacement for GPT endpoints (see the client sketch after this list)
  • Function Calling & Tool Use: Supports structured output and dynamic API linking
  • Streaming Output: Token-by-token real-time response generation
  • Configurable Guardrails: Safety, moderation, and content filters included
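
Because the endpoint speaks the OpenAI wire format, the official openai Python client works against a locally served model. A minimal streaming sketch, assuming Ollama's OpenAI-compatible endpoint at localhost:11434/v1 (the api_key value is a dummy, since local Ollama does not validate it):

  from openai import OpenAI

  client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

  # Token-by-token streaming, as described in the Streaming Output bullet above.
  stream = client.chat.completions.create(
      model="169pi/alpie-core",
      messages=[{"role": "user", "content": "Prove that 0.999... equals 1."}],
      stream=True,
  )
  for chunk in stream:
      delta = chunk.choices[0].delta.content
      if delta:
          print(delta, end="", flush=True)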

3. Sustainable and Accessible

  • Runs efficiently on consumer GPUs (16–24GB VRAM)
  • Up to 75% lower VRAM use vs. FP16 baselines (see the arithmetic sketch after this list)
  • Significantly reduced carbon and energy footprint
  • Fully open under the Apache 2.0 License
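
The 75% figure follows from bit-width alone: 4-bit weights occupy a quarter of the space of FP16 weights. A back-of-the-envelope check (weights only; the KV cache and activations add real-world overhead on top):

  params = 32.8e9                 # parameter count from the model card
  fp16_gb = params * 2 / 1e9      # 2 bytes per weight  -> ~65.6 GB
  nf4_gb = params * 0.5 / 1e9     # 4 bits per weight   -> ~16.4 GB
  print(f"FP16: {fp16_gb:.1f} GB, NF4: {nf4_gb:.1f} GB, "
        f"saving: {1 - nf4_gb / fp16_gb:.0%}")  # saving: 75%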

Quantization

  • Format: NF4 (NormalFloat 4-bit)
  • Compression Ratio: 16:1
  • Technique: QLoRA + Double Quantization
  • Implementation: bitsandbytes (bnb_4bit_use_double_quant=True)
  • Inference: Mixed precision (FP16 compute, 4-bit storage)
  • Accuracy Impact: Minimal reasoning loss
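
A minimal loading sketch that wires up the settings above via transformers and bitsandbytes (the repo id 169Pi/Alpie-Core is an assumed placeholder; bnb_4bit_use_double_quant=True is the flag named in this section):

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

  bnb_config = BitsAndBytesConfig(
      load_in_4bit=True,                     # NF4: 4-bit weight storage
      bnb_4bit_quant_type="nf4",
      bnb_4bit_use_double_quant=True,        # double quantization, as listed above
      bnb_4bit_compute_dtype=torch.float16,  # mixed precision: FP16 compute
  )
  tokenizer = AutoTokenizer.from_pretrained("169Pi/Alpie-Core")
  model = AutoModelForCausalLM.from_pretrained(
      "169Pi/Alpie-Core",
      quantization_config=bnb_config,
      device_map="auto",
  )

  inputs = tokenizer("What is 17 * 24?", return_tensors="pt").to(model.device)
  output = model.generate(**inputs, max_new_tokens=64)
  print(tokenizer.decode(output[0], skip_special_tokens=True))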

License: Apache 2.0

Use freely for research, customisation, and commercial deployment without copyleft restrictions. Ideal for experimentation, extension, and open collaboration.

More about 169Pi

169Pi Hugging Face

169Pi LinkedIn Updates