Alpie-Core brings together high reasoning accuracy, sustainable compute efficiency, and open accessibility, redefining what’s possible with 4-bit, high-performance AI.

Welcome to 169Pi’s Alpie-Core

Alpie-Core is one of the first 4-bit quantized reasoning models: a 32B-parameter system developed by the 169Pi team that matches or outperforms several full-precision frontier models. Built from the DeepSeek-R1-Distill-Qwen-32B backbone, it represents a major leap in efficient reasoning, sustainable AI, and democratized intelligence, all trained on just 8 NVIDIA Hopper GPUs.

Alpie-Core redefines what’s possible under limited resources by combining LoRA/QLoRA, groupwise-blockwise quantization, and synthetic data distillation, achieving state-of-the-art results on reasoning, coding, and math benchmarks — all while reducing memory footprint by over 75%. Designed for researchers, developers, and enterprises, Alpie-Core brings frontier-level reasoning to accessible, low-compute environments.

Get started

You can get started by downloading or running Alpie-Core with Ollama:

To pull the model:

ollama pull 169pi/alpie-core

To run it instantly:

ollama run 169pi/alpie-core

Alpie-Core can also be integrated programmatically for local or API-based workflows.
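
For example, a minimal local-integration sketch using Ollama's REST API might look like the following. It assumes the model has already been pulled as 169pi/alpie-core and that the Ollama server is running on its default port (11434):

```python
# Minimal local-integration sketch: query a locally running Ollama server
# over its REST API. Assumes `ollama pull 169pi/alpie-core` has already been
# run and that the server is listening on the default port 11434.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "169pi/alpie-core",
        "prompt": "Solve step by step: what is 17 * 24?",
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["response"])
```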

Benchmarks

Alpie-Core is built for structured reasoning, step-by-step logic, and factual responses. It achieves MMLU 81.28%, GSM8K 92.75%, BBH 85.12%, SWE-Bench Verified 57.8%, SciQ 98.0%, and HumanEval 57.23%.

Figure: SWE-Bench Verified accuracy comparison.

Feature Highlights

1. Technical Advancements

  • 4-Bit Quantization (NF4): Achieves ∼8GB memory footprint with minimal accuracy loss
  • 128K Context Length: Supports extended, multi-step reasoning over long inputs
  • Fine-Tunable: Fully customise the model to your specific use case through fine-tuning
  • LoRA + QLoRA Fine-Tuning: Retains reasoning fidelity under low-bit constraints
  • Groupwise + Blockwise Quantization: Reduces noise, enhances precision at scale
  • vLLM-based Inference: Enables low-latency, high-throughput deployment (see the sketch after this list)
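
As a rough illustration of the vLLM-based deployment mentioned above, a minimal offline-inference sketch could look like this. The Hugging Face repository id below is an assumption, so substitute the actual published checkpoint:

```python
# Illustrative vLLM offline-inference sketch (not the team's exact setup).
from vllm import LLM, SamplingParams

llm = LLM(model="169Pi/Alpie-Core")  # hypothetical Hugging Face repo id
sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=512)

prompts = ["Explain, step by step, why the sum of two even numbers is even."]
outputs = llm.generate(prompts, sampling)
print(outputs[0].outputs[0].text)
```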

2. API & Integration Ready

  • OpenAI-Compatible API: Drop-in replacement for GPT endpoints
  • Function Calling & Tool Use: Supports structured output and dynamic API linking
  • Streaming Output: Token-by-token, real-time response generation (see the sketch after this list)
  • Configurable Guardrails: Safety, moderation, and content filters included
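
A hedged sketch of the OpenAI-compatible, streaming workflow described above: the base URL below points at a local Ollama server's OpenAI-compatible endpoint, and a hosted deployment would use its own base URL and API key instead.

```python
# Drop-in OpenAI-client usage with streaming output. The base URL targets a
# local Ollama server's OpenAI-compatible endpoint; a hosted deployment would
# use its own base URL and a real API key.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

stream = client.chat.completions.create(
    model="169pi/alpie-core",
    messages=[{"role": "user", "content": "List three uses of 4-bit quantization."}],
    stream=True,  # token-by-token streaming, as described above
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```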

3. Sustainable and Accessible

  • Runs efficiently on consumer GPUs (16–24GB VRAM)
  • Up to 75% lower VRAM use vs. FP16 baselines
  • Significantly reduced carbon and energy footprint
  • Fully open under the Apache 2.0 License


Quantization

  • Format: NF4 (NormalFloat 4-bit)
  • Compression Ratio: 16:1
  • Technique: QLoRA + Double Quantization
  • Implementation: bitsandbytes (bnb_4bit_use_double_quant=True); see the configuration sketch after this list
  • Inference: Mixed precision (FP16 compute, 4-bit storage)
  • Accuracy: minimal reasoning loss relative to the FP16 baseline
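
As an illustration only, these settings map onto the Hugging Face Transformers + bitsandbytes integration roughly as in the sketch below. The repository id is an assumption, and this is not necessarily the team's exact loading setup:

```python
# Illustrative NF4 + double-quantization configuration mirroring the settings
# listed above (4-bit NF4 storage, FP16 compute). The repo id is an assumption;
# use the actual published checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat 4-bit storage format
    bnb_4bit_use_double_quant=True,        # double quantization, as noted above
    bnb_4bit_compute_dtype=torch.float16,  # FP16 compute over 4-bit weights
)

model_id = "169Pi/Alpie-Core"  # hypothetical Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

inputs = tokenizer("What is 12 squared?", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```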

License: Apache 2.0

Use freely for research, customisation, and commercial deployment without copyleft restrictions. Ideal for experimentation, extension, and open collaboration.

More about 169Pi

169Pi Hugging Face

169Pi LinkedIn Updates