169pi/alpie-core:reasoning


Alpie-Core brings together high reasoning accuracy, sustainable compute efficiency, and open accessibility, redefining what’s possible with 4-bit, high-performance AI.

Capability: thinking

ollama run 169pi/alpie-core:reasoning

Details

Updated 3 months ago · c7996d521ec5 · 20GB

  • Model: qwen2 · 32.8B · Q4_K_M
  • Model: qwen2 · 134M · F32
  • Template: {{- if .System }}{{ .System }}{{ end }} {{- range $i, $_ := .Messages }} …
  • License: MIT License, Copyright (c) 2023 DeepSeek
  • System: You are Alpie-Core, a safe, helpful, multilingual, and intelligent AI Reasoning model developed by 169Pi …
  • Params: { "num_predict": 16384, "stop": ["<|im_start|>", "<|im_end|>", …] }

Readme


Welcome to 169Pi’s Alpie-Core

Alpie-Core is one of the first 4-bit quantized reasoning models: a 32B-parameter system developed by the 169Pi team that matches or outperforms several full-precision frontier models. Built from the DeepSeek-R1-Distill-Qwen-32B backbone, it represents a major leap in efficient reasoning, sustainable AI, and democratized intelligence, all trained on just 8 NVIDIA Hopper GPUs.

Alpie-Core redefines what’s possible under limited resources by combining LoRA/QLoRA, groupwise-blockwise quantization, and synthetic data distillation, achieving state-of-the-art results on reasoning, coding, and math benchmarks — all while reducing memory footprint by over 75%. Designed for researchers, developers, and enterprises, Alpie-Core brings frontier-level reasoning to accessible, low-compute environments.

Get started

You can get started by downloading or running Alpie-Core with Ollama:

To pull the model:

ollama pull 169pi/alpie-core

To run it instantly:

ollama run 169pi/alpie-core

Alpie-Core can also be integrated programmatically for local or API-based workflows.
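For local workflows, the official ollama Python package is one option. A minimal sketch, assuming the model has already been pulled and the local Ollama server is running (the prompt is only illustrative):

```python
# pip install ollama
import ollama

# Chat with the locally pulled model via the Ollama server on localhost.
response = ollama.chat(
    model="169pi/alpie-core",
    messages=[
        {"role": "user", "content": "Walk through 17 * 24 step by step."},
    ],
)
print(response["message"]["content"])
```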

Quick Start with SDK

Access Alpie-Core through our official Python SDK (pi169) for seamless API integration:

```shell
# Install the SDK
pip install pi169

# Set your API key
export ALPIE_API_KEY="your_key_here"

# Start using the CLI
pi169 "Explain 4-bit quantization in simple terms"
```

SDK Features

  • CLI Integration for quick command-line interactions
  • Streaming & Non-Streaming Chat Completions
  • Async/Await Support for high-performance concurrent requests
  • Clean, type-safe Python Interface (dataclasses, type hints)
  • Robust Error Handling with typed exceptions
  • Production-Ready Networking (retries, timeouts, httpx)
  • Fully Tested with pytest
  • Optimized for Reasoning Models
  • OpenAI-Compatible Client: Drop-in replacement for OpenAI SDK with full compatibility
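Because the client is OpenAI-compatible, the stock openai package can also be pointed at an Alpie endpoint. A minimal sketch; the base URL and model id below are placeholders, not documented values:

```python
import os

from openai import OpenAI

# Placeholder endpoint and model id; substitute the values for your Alpie account.
client = OpenAI(
    base_url="https://api.169pi.example/v1",  # hypothetical URL
    api_key=os.environ["ALPIE_API_KEY"],
)

response = client.chat.completions.create(
    model="alpie-core",  # hypothetical model id
    messages=[{"role": "user", "content": "Explain 4-bit quantization in simple terms"}],
)
print(response.choices[0].message.content)
```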

Benchmarks

Alpie-Core is built for structured reasoning, step-by-step logic, and factual responses, and it posts strong results across reasoning, coding, and math benchmarks, including SWE-Bench Verified:

[Figure: SWE-Bench Verified accuracy comparison]

Feature Highlights

1. Technical Advancements

  • 4-Bit Quantization (NF4): Achieves ~8GB memory footprint with minimal accuracy loss

  • 128K context length for extended reasoning over long inputs

  • Fine-tunable: Fully customise models to your specific use case through parameter fine-tuning

  • LoRA + QLoRA Fine-Tuning: Retains reasoning fidelity under low-bit constraints

  • Groupwise + Blockwise Quantization: Reduces noise, enhances precision at scale

  • vLLM-based Inference: Enables low-latency and high-throughput deployment
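As a sketch of the vLLM path, here is minimal offline inference with the open-source vllm library; the Hugging Face repo id is an assumption, so check 169Pi's Hugging Face page for the published name:

```python
from vllm import LLM, SamplingParams

# Hypothetical Hugging Face repo id for the model weights.
llm = LLM(model="169Pi/Alpie-Core")

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Prove that the sum of two even numbers is even."], params)
print(outputs[0].outputs[0].text)
```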

2. API & Integration Ready

  • OpenAI-Compatible API: Drop-in replacement for GPT endpoints

  • Function Calling & Tool Use: Supports structured output and dynamic API linking

  • Streaming Output: Token-by-token real-time response generation (see the sketch after this list)

  • Configurable Guardrails: Safety, moderation, and content filters included
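A sketch of token-by-token streaming through the OpenAI-compatible API; the endpoint and model id are again placeholders:

```python
import os

from openai import OpenAI

# Placeholder endpoint and model id, as in the earlier sketch.
client = OpenAI(
    base_url="https://api.169pi.example/v1",  # hypothetical URL
    api_key=os.environ["ALPIE_API_KEY"],
)

stream = client.chat.completions.create(
    model="alpie-core",  # hypothetical model id
    messages=[{"role": "user", "content": "Summarize QLoRA in three bullets."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content  # may be None on role-only chunks
    if delta:
        print(delta, end="", flush=True)
```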

3. Sustainable and Accessible

  • Runs efficiently on consumer GPUs (16–24GB VRAM)

  • Up to 75% lower VRAM use vs. FP16 baselines

  • Significantly reduced carbon and energy footprint

  • Fully open under the Apache 2.0 License


Quantization

  • Format: NF4 (NormalFloat 4-bit)

  • Compression Ratio: 16:1

  • Technique: QLoRA + Double Quantization

  • Implementation: bitsandbytes (bnb_4bit_use_double_quant=True)

  • Inference: Mixed precision (FP16 compute, 4-bit storage)

  • Accuracy impact: Minimal reasoning loss
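This recipe corresponds to Hugging Face's transformers + bitsandbytes stack. A minimal loading sketch, assuming a hypothetical Hugging Face repo id (see 169Pi's Hugging Face page for the published weights):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 storage, double quantization, FP16 compute: the configuration described
# above. At 4 bits, storage is roughly 0.5 bytes per weight.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "169Pi/Alpie-Core"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("What is NF4 quantization?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```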

License: Apache 2.0

Use freely for research, customisation, and commercial deployment without copyleft restrictions. Ideal for experimentation, extension, and open collaboration.

More about 169Pi

  • 169Pi Hugging Face
  • 169Pi PyPI Package
  • 169Pi LinkedIn Updates
