ollama run aguitachan3/yuuki-best

Details

Updated 2 days ago

6778a34a878a · 63MB · gpt2 · 81.9M · Q4_K_M

System prompt: You are Yuuki, a code generation model trained on a smartphone with $0 budget.
License: Apache 2.0
Parameters: { "stop": [ "</s>" ], "temperature": 0.7, "top_p": 0.9 }
Template: {{ .Prompt }}

Readme


Yuuki-best

Code generation model trained entirely on a smartphone with zero budget. 82M parameters. GPT-2 architecture. Checkpoint 2000 of 37,500 steps.

HuggingFace · Sponsors · License

Quick Start

ollama run aguitachan3/yuuki-best

About

Yuuki-best is the strongest checkpoint of the Yuuki project to date, taken at step 2000 of 37,500 planned steps (5.3% of the full training run). The model was trained entirely on a Redmi 12 smartphone with a Snapdragon 685 processor in CPU-only mode, with no GPU acceleration or cloud infrastructure.

The model specializes in code generation across multiple programming languages, with particular strength in Agda, C, and Assembly. This specialization reflects the alphabetical ordering of the training dataset, which exposes the model to these languages earlier in the training process. Performance metrics show significant improvement over checkpoint 1400, with average quality scores increasing by 146% despite minimal additional training.

Supported languages: Agda, C, Assembly, JavaScript, Python (limited)

Input: Text prompts for code generation

Output: Structured code across multiple programming languages

Training device: Redmi 12 smartphone (Snapdragon 685, 6GB RAM)

Training cost: $0.00

Models

Available Quantizations

ollama run aguitachan3/yuuki-best:q4_k_m  # Recommended
Quantization  Size    Description
q4_0          46 MB   Most efficient format with fastest inference speed
q4_k_m        47 MB   Recommended format balancing quality and performance
q5_k_m        56 MB   Higher quality with moderate size increase
q8_0          87 MB   High quality format for quality-sensitive applications
f32           328 MB  Full precision format for research and baseline comparisons

The q4_k_m quantization provides optimal balance between model quality and inference efficiency for most production use cases.
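
Any tag from the table can be pulled ahead of time or run directly; for example, the higher-quality q8_0 build:

ollama pull aguitachan3/yuuki-best:q8_0
ollama run aguitachan3/yuuki-best:q8_0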

Usage

Interactive Generation

The model responds to code generation prompts across multiple programming languages. Example interactions demonstrate the model’s capability to generate structured code with appropriate syntax:

ollama run aguitachan3/yuuki-best
>>> Generate a fibonacci function in Python
>>> Create a simple linked list in C
>>> Write an Agda module for natural numbers

API Integration

Python developers can integrate the model through the Ollama API with streaming support for real-time generation:

import ollama

response = ollama.generate(
    model='aguitachan3/yuuki-best',
    prompt='def factorial(n):'
)

print(response['response'])
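
For real-time output, the same call can stream tokens as they are generated; a minimal sketch using the stream=True option of the ollama Python package:

import ollama

# Stream the completion token by token instead of waiting for the full response
for chunk in ollama.generate(
    model='aguitachan3/yuuki-best',
    prompt='def factorial(n):',
    stream=True,
):
    print(chunk['response'], end='', flush=True)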

Custom Configuration

Model behavior can be adjusted through parameter configuration. Temperature controls randomness, while the context window (num_ctx) determines how much code history the model considers. These values are not command-line flags of ollama run; set them inside an interactive session with /set parameter, per request through API options (see the sketch below), or persistently through a Modelfile (next section):

ollama run aguitachan3/yuuki-best
>>> /set parameter temperature 0.7
>>> /set parameter num_predict 200
>>> /set parameter top_p 0.9
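
The same parameters can also be supplied per request through the API options field; a minimal sketch using the ollama Python package (the prompt string is illustrative):

import ollama

response = ollama.generate(
    model='aguitachan3/yuuki-best',
    prompt='def fibonacci(n):',
    # Per-request sampling options, equivalent to the /set parameter calls above
    options={'temperature': 0.7, 'num_predict': 200, 'top_p': 0.9},
)

print(response['response'])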

Modelfile Customization

Advanced users can create customized versions with specific system prompts and parameter configurations:

FROM aguitachan3/yuuki-best:q4_k_m

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_ctx 2048
PARAMETER stop "</s>"

SYSTEM """You are Yuuki, a code generation model trained on a smartphone with zero budget.

Generate clean, well-structured code following language-specific best practices. Use proper indentation and clear variable names. Your strongest languages are Agda, C, and Assembly."""
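
Saved to a file named Modelfile, the definition above can be built into a local model and run under any name; yuuki-custom here is arbitrary:

ollama create yuuki-custom -f Modelfile
ollama run yuuki-custom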

Performance

The model demonstrates measurable improvements across evaluation metrics compared to earlier checkpoints. Performance varies by programming language based on dataset exposure:

Language  Quality Score  Training Status
Agda      55 / 100       Primary focus with best performance
C         20 / 100       Active learning with emerging structure
Assembly  15 / 100       Early stage with basic understanding
Python    8 / 100        Limited exposure due to dataset ordering

Training progression from checkpoint 1400 to 2000 shows 146% average improvement despite only 1.6% additional training steps, indicating rapid learning dynamics during early training phases.

Technical Details

The model architecture follows GPT-2 specifications with 82 million parameters distributed across 12 transformer layers. Each layer employs 12 attention heads operating on embeddings with 768 hidden dimensions. The vocabulary spans 50,257 tokens with context window support up to 1,024 tokens.
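
For reference, this layout can be written down with Hugging Face's GPT2Config. The sketch below assumes the project uses the standard GPT-2 small hyperparameters; the printed count includes embedding weights, so it reads higher than the 82M figure quoted above:

from transformers import GPT2Config, GPT2LMHeadModel

# Standard GPT-2 small layout matching the figures above (assumed; the
# project's actual training configuration is not published on this page).
config = GPT2Config(
    vocab_size=50257,  # vocabulary size
    n_positions=1024,  # maximum context window
    n_embd=768,        # hidden dimension
    n_layer=12,        # transformer layers
    n_head=12,         # attention heads per layer
)
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters (including embeddings)")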

Training occurs entirely on consumer mobile hardware without GPU acceleration. The Redmi 12 smartphone processes each training step in approximately 86 seconds using its Snapdragon 685 processor with 8 ARM cores. Total training time exceeds 50 hours for the current checkpoint, with an additional 39 hours estimated to complete the remaining steps to full v0.1 release.
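
A quick back-of-the-envelope check of the per-step figure (raw compute only; wall-clock time is higher once checkpoint saves, thermal throttling, and restarts are included):

# Rough compute-time estimate for the reported step timing
seconds_per_step = 86
steps_completed = 2000

hours = steps_completed * seconds_per_step / 3600
print(f"~{hours:.0f} hours of raw compute for {steps_completed} steps")  # ~48 hours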

The training dataset consists of code examples from The Stack, processed in alphabetical order by programming language. This ordering creates intentional bias toward languages appearing early in the alphabet, resulting in stronger performance on Agda and C compared to languages like Python and Rust.

Capabilities and Limitations

The model excels at generating Agda code with real library imports including Cubical and Data.Nat modules. C programming capabilities include basic structure understanding with appropriate syntax for common patterns. Assembly code generation demonstrates familiarity with instruction formats, though semantic correctness varies. Fast CPU inference makes the model suitable for edge deployment scenarios where GPU resources are unavailable.

Current limitations reflect the early training stage at 5.3% completion. Python support remains limited due to alphabetical dataset ordering. The model’s 82 million parameter count constrains capacity compared to larger models. Quality metrics indicate research-grade output rather than production-ready code generation. Continued training through the remaining checkpoints will address these limitations systematically.

Integration

VS Code users can integrate the model through the Continue extension by adding configuration to .continue/config.json:

{
  "models": [{
    "title": "Yuuki",
    "provider": "ollama",
    "model": "aguitachan3/yuuki-best"
  }]
}

Cursor IDE integration requires adding the model specification in Settings under Models with the identifier ollama/aguitachan3/yuuki-best.

Open WebUI provides a web interface for model interaction. Deploy the container with host gateway access and volume mounting for persistent data storage. Access the interface at localhost:3000 after container initialization.
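
A typical deployment following the Open WebUI documentation (image name, port mapping, and volume are the upstream defaults; adjust as needed):

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main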

Philosophy

The Yuuki project demonstrates that meaningful AI development remains accessible without expensive infrastructure. Training language models requires patience and determination rather than capital investment in specialized hardware. Each step validates the hypothesis that consumer devices possess sufficient computational capability for machine learning research.

The model name combines Japanese linguistic elements: Yuki meaning snow, and Yuu from the anime Girls’ Last Tour, forming Yuuki which translates to courage. This etymology captures the project’s essence of attempting the unconventional despite resource constraints.

Related Resources

Additional models in the Yuuki family include checkpoint 1400 (yuuki-3.7) for training progression research and v0.1 for the first complete release. Web applications provide alternative interaction methods through Yuuki Chat for conversational interfaces and Yuuki Web for project documentation.

Command-line tools support local model management through yuy for downloading and running models, plus yuy-chat for terminal-based interaction. Community discussion occurs through Discord, Reddit, and GitHub organization repositories.

License

Apache License 2.0

Copyright (c) 2026 Yuuki Project

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

See http://www.apache.org/licenses/LICENSE-2.0
