```sh
ollama run aiasistentworld/Llama-3.1-8B-Instruct-STO-Master
```
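Once pulled, the model can also be queried through Ollama's local HTTP API. A minimal sketch follows, assuming the Ollama server is running on its default port (11434); the prompt text is purely illustrative:

```sh
# Minimal sketch: query the model via Ollama's standard /api/chat endpoint.
# Assumes a locally running Ollama server on the default port 11434;
# the prompt below is illustrative only.
curl http://localhost:11434/api/chat -d '{
  "model": "aiasistentworld/Llama-3.1-8B-Instruct-STO-Master",
  "messages": [
    {"role": "user", "content": "Walk me through the logic of the Monty Hall problem."}
  ],
  "stream": false
}'
```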
Llama-3.1-8B-Instruct-STO-Master is a high-performance fine-tune of Meta's Llama-3.1-8B-Instruct. It is the "Master Version" (Model E) of an extensive research project aimed at pushing the boundaries of 8B-parameter architectures.
Rather than relying on traditional Supervised Fine-Tuning (SFT) alone, this model was developed using the STO (Specialized Task Optimization) method. This methodology emphasizes "Reasoning over Recall," pushing the model to work through the underlying logic of a prompt rather than simply predicting the most likely next token.
For more information on the synthetic data generation used in this project, visit: LLMResearch - Synthetic Data
Evaluation was performed using a sample limit of 250 (due to hardware constraints) across four major benchmarks: HellaSwag, ARC Challenge, GSM8K, and MMLU.
| Benchmark | Meta Llama 3.1 8B Base | STO-Master (Model E) | Δ vs. Base |
|---|---|---|---|
| MMLU (general) | 69.53% | 69.78% | ✅ +0.25 pts |
| ARC Challenge | 52.80% | 53.60% | 🏆 +0.80 pts |
| HellaSwag | 70.80% | 70.80% | 🟢 ±0.00 pts (no regression) |
| Moral Scenarios (MMLU subtask) | 59.60% | 59.20% | 🟢 −0.40 pts (stable) |
We encourage the community to run their own independent benchmarks on this model. Our internal results show that the model excels in academic writing, professional analysis, and complex STEM tasks.
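For independent verification, a run with EleutherAI's lm-evaluation-harness using the same 250-sample limit might look like the sketch below. This is not the exact command the authors used: the choice of harness, the task identifiers, and the Hugging Face repo path (taken from the citation below) are all assumptions.

```sh
# Sketch: reproduce the reported benchmarks with EleutherAI's lm-evaluation-harness.
# Assumptions: the model is published at the Hugging Face path cited below, and
# the authors' "sample limit of 250" maps to the harness's --limit flag.
pip install lm-eval

lm_eval --model hf \
  --model_args pretrained=AlexH/Llama-3.1-8B-Instruct-STO-Master \
  --tasks hellaswag,arc_challenge,gsm8k,mmlu \
  --limit 250 \
  --batch_size 8
```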
Author: AlexH
Organization: LLMResearch.net
```bibtex
@misc{alexh2026llama31sto,
  author       = {AlexH},
  title        = {Llama-3.1-8B-Instruct-STO-Master: Pushing the limits of 8B architectures},
  year         = {2026},
  publisher    = {HuggingFace},
  organization = {LLMResearch.net},
  howpublished = {\url{https://huggingface.co/AlexH/Llama-3.1-8B-Instruct-STO-Master}}
}
```