
Integrator-1 is an LLM fine-tuned from Llama-3.2-3B-Instruct that reasons in the format <reasoning>...</reasoning><answer>...</answer>. Its capability might improve with further training.
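The output format above lends itself to simple post-processing. The following is a minimal sketch, not part of the model's tooling, of how a response could be split into its reasoning and answer parts. The function name `split_response` and the tolerance for whitespace between the two tags are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Pattern for the <reasoning>...</reasoning><answer>...</answer> layout.
# Allowing whitespace between the two blocks is an assumption.
_PATTERN = re.compile(
    r"<reasoning>(.*?)</reasoning>\s*<answer>(.*?)</answer>",
    re.DOTALL,
)

def split_response(text: str) -> Optional[Tuple[str, str]]:
    """Return (reasoning, answer), stripped, or None if the tags are missing."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()

# Hypothetical model output used only to exercise the parser.
sample = (
    "<reasoning>The integrand -x is odd, so the integral over [-8, 8] "
    "vanishes.</reasoning><answer>0</answer>"
)
print(split_response(sample))
```

Returning `None` for malformed output lets a caller distinguish a format failure from a wrong answer.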

ollama run ImFineThanks/Integrator-1-R1-ZERO-3B

Applications

  • Claude Code: ollama launch claude --model ImFineThanks/Integrator-1-R1-ZERO-3B
  • Codex App: ollama launch codex-app --model ImFineThanks/Integrator-1-R1-ZERO-3B
  • OpenClaw: ollama launch openclaw --model ImFineThanks/Integrator-1-R1-ZERO-3B
  • Hermes Agent: ollama launch hermes --model ImFineThanks/Integrator-1-R1-ZERO-3B
  • Codex: ollama launch codex --model ImFineThanks/Integrator-1-R1-ZERO-3B
  • OpenCode: ollama launch opencode --model ImFineThanks/Integrator-1-R1-ZERO-3B


Readme

Integrator-1: A Specialized Language Model for Calculus Integrals

Overview

Integrator-1 is a fine-tuned Large Language Model (LLM) optimized for solving calculus integrals. Built on the Llama-3.2-3B-Instruct base model, it uses reinforcement learning and Low-Rank Adaptation (LoRA) to achieve improved accuracy in integral-solving tasks. The project demonstrates the efficacy of domain-specific fine-tuning for enhancing LLM performance in mathematical applications.

Dataset Creation

The training dataset consists of 10,000 synthetic integrals, generated via a custom Python script. Key features include:

  • Format: Integrals and solutions are formatted in LaTeX (e.g., \int_{-8}^{8} -x \, dx).

  • Solutions: Numerically computed and rounded to integers, constrained to the range [-20, 20].

  • Difficulty Tiers: Three levels of complexity to ensure progressive learning.

Implementation: The functions generar_integral() and crear_dataset() handle generation and deduplication. The script is included in the repository for reproducibility.
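To make the dataset description concrete, here is a rough sketch of how such examples could be produced. It is not the repository's generar_integral() or crear_dataset(). The names `linear_integral` and `build_dataset`, the restriction to linear integrands, and the coefficient and bound ranges are all assumptions. It also integrates exactly with `Fraction` before rounding, instead of the numerical computation the README mentions.

```python
import random
from fractions import Fraction

def format_linear(a: int, b: int) -> str:
    """Render a*x + b as text, e.g. '-x', '2x + 3', '5'."""
    if a == 0:
        return str(b)
    coef = {1: "", -1: "-"}.get(a, str(a))
    term = f"{coef}x"
    if b == 0:
        return term
    sign = "+" if b > 0 else "-"
    return f"{term} {sign} {abs(b)}"

def linear_integral(a: int, b: int, lo: int, hi: int):
    """Return (latex, rounded value) for the integral of a*x + b on [lo, hi]."""
    # Antiderivative a*x^2/2 + b*x evaluated between lo and hi, exactly.
    value = Fraction(a * (hi**2 - lo**2), 2) + b * (hi - lo)
    latex = rf"\int_{{{lo}}}^{{{hi}}} {format_linear(a, b)} \, dx"
    return latex, round(value)

def build_dataset(n: int, seed: int = 0, max_tries: int = 100_000):
    """Collect up to n unique integrals whose rounded value lies in [-20, 20]."""
    rng = random.Random(seed)
    seen = {}  # keyed by LaTeX string, so duplicates are dropped
    for _ in range(max_tries):
        if len(seen) == n:
            break
        a, b = rng.randint(-3, 3), rng.randint(-3, 3)
        lo = rng.randint(-8, 7)
        hi = rng.randint(lo + 1, 8)
        latex, value = linear_integral(a, b, lo, hi)
        if -20 <= value <= 20:
            seen.setdefault(latex, value)
    return list(seen.items())

for latex, value in build_dataset(3, seed=1):
    print(latex, "=", value)
```

Keying the dictionary on the LaTeX string is one simple way to deduplicate. The `max_tries` cap keeps the loop from running forever if `n` exceeds the number of distinct in-range integrals.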

Fine-Tuning Process

Integrator-1 was fine-tuned using Unsloth’s framework on Google Colab’s free tier. The process involved:

  • Base Model: Llama-3.2-3B-Instruct.
  • Techniques:
    • Reinforcement Learning (GRPO/RLHF) for step-by-step reasoning.
    • LoRA with a rank of 8 for memory-efficient adaptation.
  • Training Parameters:
    • Steps: 360
    • Learning Rate: 5e-6
    • Optimizer: AdamW (8-bit)
    • Scheduler: Cosine
  • Script: Adapted from Unsloth’s training examples, available in the repository.
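GRPO scores a group of sampled completions for the same prompt and reinforces those that score above the group average. The sketch below illustrates that idea in plain Python. It is not the repository's training script. The reward values (0.5 for following the format, 2.0 for a correct answer), the names `reward` and `group_advantages`, and the `eps` constant are assumptions chosen for illustration.

```python
import re
from statistics import mean, pstdev

ANSWER_RE = re.compile(
    r"<reasoning>.*?</reasoning>\s*<answer>(.*?)</answer>", re.DOTALL
)

def reward(completion: str, target: int) -> float:
    """Hypothetical reward: credit for the format, more for the right integer."""
    match = ANSWER_RE.search(completion)
    if match is None:
        return 0.0  # format not followed
    score = 0.5  # format followed
    try:
        if int(match.group(1).strip()) == target:
            score += 2.0  # correct final answer
    except ValueError:
        pass  # answer is not an integer; keep the format credit only
    return score

def group_advantages(rewards, eps: float = 1e-4):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four hypothetical completions for one prompt whose true answer is 0.
completions = [
    "<reasoning>odd integrand</reasoning><answer>0</answer>",
    "<reasoning>guess</reasoning><answer>3</answer>",
    "0",
    "<reasoning>symmetry</reasoning><answer>0</answer>",
]
rewards = [reward(c, target=0) for c in completions]
print(rewards, group_advantages(rewards))
```

Because the advantages are centered within the group, correct and well-formatted completions get positive values while the rest get negative ones, without needing a separate value model.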

Testing and Results

A testing script (test_model()) evaluated Integrator-1 against the base model on a 50-integral test set. Results:

  • Base Model Accuracy: 34%
  • Integrator-1 Accuracy: 66%
  • Improvement: 32 percentage points (a 94% relative increase)

These metrics highlight the benefit of specialization, though further training could enhance performance.
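The reported figures can be checked with a few lines of arithmetic. This is a small sketch, not the repository's test_model(). The helper names `accuracy` and `relative_increase` are assumptions, and the counts of 17 and 33 correct answers are derived from 34% and 66% of 50 test integrals.

```python
def accuracy(predictions, targets) -> float:
    """Fraction of predictions that exactly match their targets."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)

def relative_increase(base: float, tuned: float) -> float:
    """Improvement of `tuned` over `base`, relative to `base`."""
    return (tuned - base) / base

base_acc = 17 / 50   # 34% of the 50-integral test set
tuned_acc = 33 / 50  # 66% of the 50-integral test set
print(f"{relative_increase(base_acc, tuned_acc):.0%}")  # prints 94%
```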

Lessons Learned

  • Domain-specific fine-tuning substantially boosts performance on niche tasks: accuracy here rose from 34% to 66%.
  • LoRA and Unsloth enable efficient training on constrained hardware.
  • Open-source tools streamline development and deployment.

Usage

To use Integrator-1, download and run the model with the following command:

ollama run ImFineThanks/Integrator-1-R1-ZERO-3B

To test it, follow the instructions in the repository’s scripts.
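Once the model has been pulled, it can also be queried programmatically through the local Ollama server. The following is a minimal sketch that assumes the server is running on its default address (localhost, port 11434) and uses the non-streaming `/api/generate` endpoint. The helper names `build_request` and `ask` are hypothetical, and the best prompt wording for this model is not documented here, so passing the bare LaTeX integral is an assumption.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed).
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "ImFineThanks/Integrator-1-R1-ZERO-3B"

def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the model."""
    payload = {"model": MODEL, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]

# Example call (requires a running Ollama server with the model pulled):
# print(ask(r"\int_{-8}^{8} -x \, dx"))
```

Separating `build_request` from `ask` makes the payload easy to inspect without contacting the server.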
