MatBot 1.3

A tiny experimental math model that overtrained itself into absolute confusion.

MatBot 1.3 is an experimental fine-tuned language model built on SmolLM 135M and trained on the GSM8K math reasoning dataset. The project explored whether a very small model could be pushed toward stronger math performance through narrow specialization.
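
As a rough illustration of the setup, a minimal fine-tuning sketch is shown below. The model id, dataset config, and every hyperparameter are illustrative assumptions, not the exact recipe behind MatBot 1.3.

```python
# Hypothetical fine-tuning sketch: all hyperparameters are illustrative.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "HuggingFaceTB/SmolLM-135M"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# GSM8K rows carry "question" and "answer" fields; concatenate them
# into a single causal-LM training string.
dataset = load_dataset("gsm8k", "main", split="train")

def tokenize(example):
    text = (
        f"Question: {example['question']}\n"
        f"Answer: {example['answer']}{tokenizer.eos_token}"
    )
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="matbot-ft",
        per_device_train_batch_size=8,
        num_train_epochs=3,  # small model + many epochs invites overfitting
        learning_rate=2e-5,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```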

The outcome was scientifically useful. The training loss converged, but the model overfit so aggressively that it stopped producing answers altogether and instead looped its own output template. MatBot 1.3 stands as a clear example of training collapse in extremely small architectures.
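
This looping failure mode is straightforward to probe. A minimal, hypothetical check (the n-gram size and threshold are arbitrary choices, not part of this project) flags generations dominated by a repeated template:

```python
def repeats_template(text: str, n: int = 8, threshold: float = 0.5) -> bool:
    """Crude proxy for template looping.

    If most n-grams in the output are duplicates of earlier ones,
    the model is almost certainly cycling through a fixed template
    rather than producing a fresh answer.
    """
    tokens = text.split()
    if len(tokens) < 2 * n:
        return False
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    duplicate_fraction = 1 - len(set(ngrams)) / len(ngrams)
    return duplicate_fraction > threshold

# A looped template trips the check; ordinary prose does not.
print(repeats_template("The answer is 42. " * 12))  # True
print(repeats_template(
    "Natalia sold 48 clips in April and half as many in May, so 72 in total."
))  # False
```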

Benchmark Comparison

Approximate GSM8K performance:

| Model | Parameters | GSM8K Accuracy | Notes |
|---|---|---|---|
| SmolLM 135M (base) | 135M | ~6% | Baseline model with no math specialization |
| MatBot 1.3 | 135M | ~0% | Collapsed into deterministic template repetition |

This highlights the primary lesson: narrow fine-tuning on a small model can degrade performance rather than improve it.
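
For context, GSM8K accuracy is conventionally scored by comparing the final number in a completion against the gold answer, which the dataset stores after a "#### " marker. The sketch below shows that scoring convention; it is not the exact harness used for these numbers.

```python
import re

def extract_gold(answer: str) -> str:
    """GSM8K gold answers end with '#### <number>'."""
    return answer.split("####")[-1].strip().replace(",", "")

def extract_prediction(completion: str) -> str | None:
    """Take the last number in the model's completion as its answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return numbers[-1] if numbers else None

def gsm8k_accuracy(completions, answers):
    correct = sum(
        extract_prediction(c) == extract_gold(a)
        for c, a in zip(completions, answers)
    )
    return correct / len(answers)

gold = "Natalia sold 48/2 = 24 clips in May.\n#### 72"
print(extract_gold(gold))                          # '72'
print(extract_prediction("So the answer is 72."))  # '72'
```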

Goals

  1. Test whether direct fine-tuning on GSM8K can improve a tiny general model.
  2. Study collapse modes such as template looping and extreme memorization.
  3. Provide an accessible example of overfitting dynamics in small models.

Intended Use

MatBot 1.3 is suited for:

  • Research on overfitting and training instability
  • Studying tiny model limitations
  • Educational demonstrations
  • Benchmark experimentation

It is not suited for real problem solving, general reasoning, or production use.
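
For educational use, a minimal generation script is enough to observe the collapse firsthand. The repository id below is a placeholder, since the full Hub path is not specified here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/matbot1.3"  # placeholder: substitute the real Hub id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "Question: Natalia sold 48 clips in April and half as many in May. How many clips did she sell in total?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
# Expected behavior: instead of a solution, the model repeats its
# training template in a loop.
```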

License

This project is released under the MIT License.

Dataset Citation

MatBot 1.3 was trained on the GSM8K dataset. Please cite the dataset as follows:

@article{cobbe2021gsm8k,
  title={Training Verifiers to Solve Math Word Problems},
  author={Cobbe, Karl and Kosaraju, Vineet and Bavarian, Mohammad and Chen, Mark and Jun, Heewoo and Kaiser, Lukasz and Plappert, Matthias and Tworek, Jerry and Hilton, Jacob and Nakano, Reiichiro and Hesse, Christopher and Schulman, John},
  journal={arXiv preprint arXiv:2110.14168},
  year={2021}
}

Summary

MatBot 1.3 is not a capable math model. It is a deliberately small research artifact that shows how narrow fine-tuning can overwhelm the capacity of tiny architectures. While it does not solve math problems, it provides a useful example of training collapse and serves as a compact case study in the limits of small-scale supervised fine-tuning.