Integrator-1-R1-ZERO-3B is an LLM fine-tuned from Llama-3.2-3B-Instruct that reasons in the format <reasoning>...</reasoning><answer>...</answer>. Further training may improve its capability.

tools
ollama run ImFineThanks/Integrator-1-R1-ZERO-3B

Details

1 year ago

b4075145a909 · 2.0GB · llama · 3.21B · Q4_K_M
As you are a great math teacher, you have no problem solving integrals. Your name is Integrator-1 Yo
MIT
{ "num_ctx": 4096, "stop": [ "<|start_header_id|>", "<|end_header_id|>",
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|> {{- if .System }} {{ .System }

Readme

Integrator-1: A Specialized Language Model for Calculus Integrals

Overview

Integrator-1 is a fine-tuned Large Language Model (LLM) optimized for solving calculus integrals. Built on the Llama-3.2-3B-Instruct base model, it uses reinforcement learning and Low-Rank Adaptation (LoRA) to achieve improved accuracy in integral-solving tasks. The project demonstrates the efficacy of domain-specific fine-tuning for enhancing LLM performance in mathematical applications.

Dataset Creation

The training dataset consists of 10,000 synthetic integrals, generated via a custom Python script. Key features include:

  • Format: Integrals and solutions are formatted in LaTeX (e.g., \int_{-8}^{8} -x \, dx).

  • Solutions: Numerically computed and rounded to integers, constrained to the range [-20, 20].

  • Difficulty Tiers: Three levels of complexity to ensure progressive learning.

    Implementation:

    Functions generar_integral() and crear_dataset() handle generation and deduplication. The script is included in the repository for reproducibility.

Fine-Tuning Process

Integrator-1 was fine-tuned using Unsloth’s framework on Google Colab’s free tier. The process involved:

  • Base Model: Llama-3.2-3B-Instruct.
  • Techniques:
    • Reinforcement Learning (GRPO/RLHF) for step-by-step reasoning.
    • LoRA with a rank of 8 for memory-efficient adaptation.
  • Training Parameters:
    • Steps: 360
    • Learning Rate: 5e-6
    • Optimizer: AdamW (8-bit)
    • Scheduler: Cosine
  • Script: Adapted from Unsloth’s training examples, available in the repository.
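
GRPO-style training scores each sampled completion with one or more reward functions. The README does not list the rewards that were used, so the sketch below is only an assumption about what a reward for the <reasoning>...</reasoning><answer>...</answer> format could look like. The function names and the weights 0.5 and 2.0 are hypothetical.

```python
import re

# Whole completion must be a reasoning block followed by an answer block.
PATTERN = re.compile(
    r"^\s*<reasoning>(.*?)</reasoning>\s*<answer>(.*?)</answer>\s*$",
    re.DOTALL,
)


def extract_answer(completion):
    """Return the integer inside <answer>...</answer>, or None if absent or not an integer."""
    match = PATTERN.match(completion)
    if match is None:
        return None
    try:
        return int(match.group(2).strip())
    except ValueError:
        return None


def format_reward(completion):
    """Small reward for following the expected tag structure (weight is an assumption)."""
    return 0.5 if PATTERN.match(completion) else 0.0


def correctness_reward(completion, target):
    """Larger reward when the extracted integer equals the target (weight is an assumption)."""
    return 2.0 if extract_answer(completion) == target else 0.0


def total_reward(completion, target):
    return format_reward(completion) + correctness_reward(completion, target)
```

Separating the format reward from the correctness reward gives the model partial credit for well-formed output even when the integer is wrong, which is one common way to shape this kind of training signal.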

Testing and Results

A testing script (test_model()) evaluated Integrator-1 against the base model on a 50-integral test set. Results:

  • Base Model Accuracy: 34%
  • Integrator-1 Accuracy: 66%
  • Improvement: 32 percentage points, a 94% relative increase

These metrics highlight the benefit of specialization, though further training could enhance performance.
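
As a check on the arithmetic, the short sketch below recomputes the reported figures. The counts of 17 and 33 correct answers are not stated in the README; they are inferred from 34% and 66% of a 50-integral test set. The function names are illustrative.

```python
def accuracy(correct, total):
    """Fraction of test integrals answered correctly."""
    return correct / total


def relative_improvement(baseline, tuned):
    """Gain of the tuned model expressed as a fraction of the baseline accuracy."""
    return (tuned - baseline) / baseline


TOTAL = 50  # test-set size reported above
base = accuracy(17, TOTAL)   # inferred count: 34% of 50
tuned = accuracy(33, TOTAL)  # inferred count: 66% of 50

print(f"base={base:.0%} tuned={tuned:.0%} "
      f"relative gain={relative_improvement(base, tuned):.0%}")
# prints: base=34% tuned=66% relative gain=94%
```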

Lessons Learned

  • Domain-specific fine-tuning can substantially boost performance in niche tasks; here it nearly doubled accuracy, from 34% to 66%.
  • LoRA and Unsloth enable efficient training on constrained hardware.
  • Open-source tools streamline development and deployment.

Usage

To use Integrator-1, run the following command, which downloads the model on first use and starts an interactive session:

ollama run ImFineThanks/Integrator-1-R1-ZERO-3B

To test it, follow the instructions in the repository’s scripts.
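
For scripted use, the model can also be queried through Ollama's local REST API. The sketch below is not part of the repository. It assumes an Ollama server running on the default port 11434 with the model already pulled, and the helper names build_request, parse_answer and ask are hypothetical.

```python
import json
import re
import urllib.request

MODEL = "ImFineThanks/Integrator-1-R1-ZERO-3B"
URL = "http://localhost:11434/api/generate"  # assumes Ollama's default local endpoint


def build_request(prompt):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": MODEL, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_answer(text):
    """Return the text inside <answer>...</answer>, or None if the tag is missing."""
    match = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return match.group(1).strip() if match else None


def ask(prompt):
    """Send the prompt to the local server and return the raw completion text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, parse_answer(ask(r"\int_{-8}^{8} -x \, dx")) would return the model's final answer as a string, or None if the completion did not follow the expected format.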
