Code generation model trained entirely on a smartphone with zero budget. 82M parameters. GPT-2 architecture. Checkpoint 2000 of 37,500 steps.
ollama run aguitachan3/yuuki-best
Yuuki-best is the strongest checkpoint of the Yuuki project to date, taken at step 2000 of 37,500 planned steps (about 5.3% of total training). The model was trained entirely on a Redmi 12 smartphone with a Snapdragon 685 processor running in CPU-only mode, requiring no GPU acceleration or cloud infrastructure.
The model specializes in code generation across multiple programming languages, with particular strength in Agda, C, and Assembly. This specialization reflects the alphabetical ordering of the training dataset, which exposes the model to these languages earlier in the training process. Performance metrics show significant improvement over checkpoint 1400, with average quality scores increasing by 146% despite minimal additional training.
Supported languages: Agda, C, Assembly, JavaScript, Python (limited)
Input: Text prompts for code generation
Output: Structured code across multiple programming languages
Training device: Redmi 12 smartphone (Snapdragon 685, 6GB RAM)
Training cost: $0.00
ollama run aguitachan3/yuuki-best:q4_k_m # Recommended
| Quantization | Size | Description |
|---|---|---|
| q4_0 | 46 MB | Most efficient format with fastest inference speed |
| q4_k_m | 47 MB | Recommended format balancing quality and performance |
| q5_k_m | 56 MB | Higher quality with moderate size increase |
| q8_0 | 87 MB | High quality format for quality-sensitive applications |
| f32 | 328 MB | Full precision format for research and baseline comparisons |
The q4_k_m quantization provides optimal balance between model quality and inference efficiency for most production use cases.
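Any tag from the table can be pulled or run directly by appending it to the model name, for example:
ollama pull aguitachan3/yuuki-best:q8_0
ollama run aguitachan3/yuuki-best:q8_0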
The model responds to code generation prompts across multiple programming languages. Example interactions demonstrate the model’s capability to generate structured code with appropriate syntax:
ollama run aguitachan3/yuuki-best
>>> Generate a fibonacci function in Python
>>> Create a simple linked list in C
>>> Write an Agda module for natural numbers
Python developers can integrate the model through the Ollama Python API, which supports both single-shot and streaming generation:
import ollama

# Single-shot generation: returns the full completion in one response
response = ollama.generate(
    model='aguitachan3/yuuki-best',
    prompt='def factorial(n):'
)
print(response['response'])
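For real-time output, a minimal streaming variant of the same call (a sketch assuming the ollama Python package's stream=True generator interface) looks like this:
import ollama

# Stream tokens as they are generated instead of waiting for the full response
for chunk in ollama.generate(
    model='aguitachan3/yuuki-best',
    prompt='def factorial(n):',
    stream=True
):
    print(chunk['response'], end='', flush=True)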
Model behavior can be adjusted through parameter configuration. Temperature controls randomness, while context window size (num_ctx) affects the amount of code history the model considers. Within an interactive ollama run session, parameters are set with the /set command:
ollama run aguitachan3/yuuki-best
>>> /set parameter temperature 0.7
>>> /set parameter num_predict 200
>>> /set parameter top_p 0.9
>>> /set parameter num_ctx 1024
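The same parameters can also be passed per request through the options field of the Python API; the values below simply mirror the session example above and are not tuned recommendations:
import ollama

# Pass sampling parameters for a single request via the options dictionary
response = ollama.generate(
    model='aguitachan3/yuuki-best',
    prompt='int main(void) {',
    options={
        'temperature': 0.7,
        'num_predict': 200,
        'top_p': 0.9,
        'num_ctx': 1024  # the model supports contexts up to 1,024 tokens
    }
)
print(response['response'])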
Advanced users can create customized versions with specific system prompts and parameter configurations:
FROM aguitachan3/yuuki-best:q4_k_m
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_ctx 2048
PARAMETER stop "</s>"
SYSTEM """You are Yuuki, a code generation model trained on a smartphone with zero budget.
Generate clean, well-structured code following language-specific best practices. Use proper indentation and clear variable names. Your strongest languages are Agda, C, and Assembly."""
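To use this configuration, save it as a Modelfile and register it under a new name with ollama create (the name yuuki-custom below is arbitrary):
ollama create yuuki-custom -f Modelfile
ollama run yuuki-custom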
The model demonstrates measurable improvements across evaluation metrics compared to earlier checkpoints. Performance varies by programming language based on dataset exposure:
| Language | Quality Score | Training Status |
|---|---|---|
| Agda | 55 / 100 | Primary focus with best performance |
| C | 20 / 100 | Active learning with emerging structure |
| Assembly | 15 / 100 | Early stage with basic understanding |
| Python | 8 / 100 | Limited exposure due to dataset ordering |
Training progression from checkpoint 1400 to 2000 shows 146% average improvement despite only 1.6% additional training steps, indicating rapid learning dynamics during early training phases.
The model architecture follows GPT-2 specifications with 82 million parameters distributed across 12 transformer layers. Each layer employs 12 attention heads operating on embeddings with 768 hidden dimensions. The vocabulary spans 50,257 tokens with context window support up to 1,024 tokens.
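For reference, these dimensions map onto a standard Hugging Face GPT2Config as sketched below; this is an illustration of the reported architecture, and the exact 82M parameter count depends on details such as embedding weight tying that the model card does not specify.
from transformers import GPT2Config

# Architecture dimensions as reported in the model card
config = GPT2Config(
    vocab_size=50257,   # GPT-2 BPE vocabulary
    n_positions=1024,   # maximum context window
    n_embd=768,         # hidden size
    n_layer=12,         # transformer layers
    n_head=12           # attention heads per layer
)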
Training occurs entirely on consumer mobile hardware without GPU acceleration. The Redmi 12 smartphone processes each training step in approximately 86 seconds using its Snapdragon 685 processor with 8 ARM cores. Total training time exceeds 50 hours for the current checkpoint, with an additional 39 hours estimated to complete the remaining steps to full v0.1 release.
The training dataset consists of code examples from The Stack, processed in alphabetical order by programming language. This ordering creates intentional bias toward languages appearing early in the alphabet, resulting in stronger performance on Agda and C compared to languages like Python and Rust.
The model excels at generating Agda code with real library imports including Cubical and Data.Nat modules. C programming capabilities include basic structure understanding with appropriate syntax for common patterns. Assembly code generation demonstrates familiarity with instruction formats, though semantic correctness varies. Fast CPU inference makes the model suitable for edge deployment scenarios where GPU resources are unavailable.
Current limitations reflect the early training stage at 5.3% completion. Python support remains limited due to alphabetical dataset ordering. The model’s 82 million parameter count constrains capacity compared to larger models. Quality metrics indicate research-grade output rather than production-ready code generation. Continued training through the remaining checkpoints will address these limitations systematically.
VS Code users can integrate the model through the Continue extension by adding configuration to .continue/config.json:
{
  "models": [{
    "title": "Yuuki",
    "provider": "ollama",
    "model": "aguitachan3/yuuki-best"
  }]
}
Cursor IDE integration requires adding the model specification in Settings under Models with the identifier ollama/aguitachan3/yuuki-best.
Open WebUI provides a web interface for model interaction. Deploy the container with host gateway access and volume mounting for persistent data storage. Access the interface at localhost:3000 after container initialization.
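A typical deployment follows the Open WebUI project's published Docker quick-start (the image tag, port mapping, and flags shown here may change between releases):
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
Once the container is running, open localhost:3000 and select aguitachan3/yuuki-best from the model list.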
The Yuuki project demonstrates that meaningful AI development remains accessible without expensive infrastructure. Training language models requires patience and determination rather than capital investment in specialized hardware. Each step validates the hypothesis that consumer devices possess sufficient computational capability for machine learning research.
The model name combines Japanese linguistic elements: Yuki meaning snow, and Yuu from the anime Girls’ Last Tour, forming Yuuki which translates to courage. This etymology captures the project’s essence of attempting the unconventional despite resource constraints.
Additional models in the Yuuki family include checkpoint 1400 (yuuki-3.7) for training progression research and v0.1 for the first complete release. Web applications provide alternative interaction methods through Yuuki Chat for conversational interfaces and Yuuki Web for project documentation.
Command-line tools support local model management through yuy for downloading and running models, plus yuy-chat for terminal-based interaction. Community discussion occurs through Discord, Reddit, and GitHub organization repositories.
Apache License 2.0
Copyright (c) 2026 Yuuki Project
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
See http://www.apache.org/licenses/LICENSE-2.0