Integrator-1 is an LLM fine-tuned from Llama-3.2-3B-Instruct that reasons in the format <reasoning>...</reasoning><answer>...</answer>. Further training might improve its capability.
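Output in this format can be split into its two parts with a small parser. The following is a minimal sketch, assuming each tag pair appears once and in this order; `parse_response` is a hypothetical helper and is not taken from the model's repository.

```python
import re

# Hypothetical helper: matches one <reasoning> block followed by one <answer> block.
_PATTERN = re.compile(
    r"<reasoning>(?P<reasoning>.*?)</reasoning>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,  # reasoning may span several lines
)


def parse_response(text):
    """Return (reasoning, answer) as stripped strings, or None if the format is not followed."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    return match.group("reasoning").strip(), match.group("answer").strip()


sample = "<reasoning>-x is odd and the interval is symmetric.</reasoning><answer>0</answer>"
print(parse_response(sample))
```

Returning `None` for malformed output lets a caller count format failures separately from wrong answers.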

Details

Updated 1 year ago

b4075145a909 · 2.0GB

model: arch llama · parameters 3.21B · quantization Q4_K_M · 2.0GB
system: As you are a great math teacher, you have no problem solving integrals. Your name is Integrator-1 Yo · 436B
license: MIT
params: { "num_ctx": 4096, "stop": [ "<|start_header_id|>", "<|end_header_id|>", · 127B
template: {{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|> {{- if .System }} {{ .System } · 1.5kB

Integrator-1: A Specialized Language Model for Calculus Integrals

Overview

Integrator-1 is a fine-tuned Large Language Model (LLM) optimized for solving calculus integrals. Built on the Llama-3.2-3B-Instruct base model, it uses reinforcement learning and Low-Rank Adaptation (LoRA) to achieve improved accuracy in integral-solving tasks. The project demonstrates the efficacy of domain-specific fine-tuning for enhancing LLM performance in mathematical applications.

Dataset Creation

The training dataset consists of 10,000 synthetic integrals, generated via a custom Python script. Key features include:

Format: Integrals and solutions are formatted in LaTeX (e.g., \int_{-8}^{8} -x \, dx).
Solutions: Numerically computed and rounded to integers, constrained to the range [-20, 20].
Difficulty Tiers: Three levels of complexity to ensure progressive learning.

Implementation:

Functions generar_integral() and crear_dataset() handle generation and deduplication. The script is included in the repository for reproducibility.
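The generation and deduplication steps described above could look roughly like the sketch below. It is not the repository's script: `generate_integral` and `build_dataset` are hypothetical stand-ins for `generar_integral()` and `crear_dataset()`, the coefficient and bound ranges are assumptions, and only one simple tier (linear integrands) is covered.

```python
import random
from fractions import Fraction


def generate_integral(rng):
    """Return (latex, answer) for a random linear integrand, or None if the answer is out of range."""
    a, b = rng.randint(-3, 3), rng.randint(-3, 3)  # assumed coefficient range
    lo = rng.randint(-8, 7)                        # assumed bound range
    hi = rng.randint(lo + 1, 8)
    # Exact value of the integral of a*x + b over [lo, hi].
    value = Fraction(a * (hi**2 - lo**2), 2) + b * (hi - lo)
    answer = round(value)  # solutions are rounded to integers
    if not -20 <= answer <= 20:
        return None        # keep only answers inside [-20, 20]
    latex = f"\\int_{{{lo}}}^{{{hi}}} ({a}x{b:+d}) \\, dx"
    return latex, answer


def build_dataset(n, seed=0, max_attempts=100_000):
    """Collect up to n unique integrals, deduplicated by their LaTeX string."""
    rng = random.Random(seed)
    seen = {}
    for _ in range(max_attempts):
        if len(seen) == n:
            break
        item = generate_integral(rng)
        if item is not None:
            seen.setdefault(item[0], item[1])
    return [{"integral": k, "answer": v} for k, v in seen.items()]


for row in build_dataset(3, seed=1):
    print(row)
```

Seeding the generator keeps the dataset reproducible, and the attempt cap guarantees the loop ends even if the requested size exceeds the number of distinct integrals available.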

Fine-Tuning Process

Integrator-1 was fine-tuned using Unsloth’s framework on Google Colab’s free tier. The process involved:

Base Model: Llama-3.2-3B-Instruct.
Techniques:
Reinforcement Learning (GRPO/RLHF) for step-by-step reasoning.
LoRA with a rank of 8 for memory-efficient adaptation.
Training Parameters:
Steps: 360
Learning Rate: 5e-6
Optimizer: AdamW (8-bit)
Scheduler: Cosine
Script: Adapted from Unsloth’s training examples, available in the repository.
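To make the scheduler setting concrete, the sketch below computes the learning rate a cosine schedule would produce over the 360 steps. It assumes no warmup and a decay to zero, neither of which the source states; the actual training script may differ.

```python
import math

PEAK_LR = 5e-6     # learning rate from the training parameters
TOTAL_STEPS = 360  # step count from the training parameters


def cosine_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS):
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step, 0), total_steps) / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


for step in (0, 90, 180, 270, 360):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the rate is halved at the midpoint (step 180) and approaches zero at step 360.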

Testing and Results

A testing script (test_model()) evaluated Integrator-1 against the base model on a 50-integral test set. Results:

Base Model Accuracy: 34%
Integrator-1 Accuracy: 66%
Improvement: 32 percentage points (94% relative increase)

These metrics highlight the benefit of specialization, though further training could enhance performance.
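The reported figures can be checked with a few lines of arithmetic. In this sketch, `accuracy` and `relative_improvement` are hypothetical helpers rather than part of `test_model()`, and the counts of 17 and 33 correct answers are inferred from the percentages and the 50-integral test set.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def relative_improvement(base, tuned):
    """Relative gain of tuned over base, e.g. 0.5 means 50% better."""
    return (tuned - base) / base


# Inferred counts: 34% and 66% of 50 integrals.
base_acc = 17 / 50
tuned_acc = 33 / 50
print(f"absolute gain: {tuned_acc - base_acc:.2f}")
print(f"relative gain: {relative_improvement(base_acc, tuned_acc):.1%}")
```

With only 50 test items, each integral is worth 2 percentage points, so small differences between runs should be read with caution.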

Lessons Learned

Domain-specific fine-tuning can substantially boost performance on niche tasks: here, accuracy on the 50-integral test set rose from 34% to 66%.
LoRA and Unsloth enable efficient training on constrained hardware.
Open-source tools streamline development and deployment.

Usage

To use Integrator-1, download and run the model with the following command:

ollama run ImFineThanks/Integrator-1-R1-ZERO-3B

To test it, follow the instructions in the repository’s scripts.
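Once the model has been pulled, it can also be queried programmatically through Ollama's local REST API. The sketch below assumes an Ollama server running on the default local port; the prompt wording and the helper names `build_payload` and `ask` are illustrative assumptions, not part of the repository.

```python
import json
import urllib.request

MODEL = "ImFineThanks/Integrator-1-R1-ZERO-3B"
OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_payload(integral_latex):
    """Build a non-streaming generate request; the prompt wording is an assumption."""
    return {"model": MODEL, "prompt": f"Solve: {integral_latex}", "stream": False}


def ask(integral_latex, url=OLLAMA_URL):
    """Send one integral to the local Ollama server and return the raw model text."""
    data = json.dumps(build_payload(integral_latex)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Example (requires a running Ollama server with the model pulled):
# print(ask(r"\int_{-8}^{8} -x \, dx"))
```

Setting `"stream": False` returns the whole completion in one JSON object, which is simpler to handle than the default streamed chunks.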