
Model: qwen2 · 32.8B parameters · Q4_K_M quantization · 20GB

Readme

https://huggingface.co/NovaSky-AI/Sky-T1-32B-Preview

Model Description

This is a 32B reasoning model trained from Qwen2.5-32B-Instruct on 17K training examples. Its performance is on par with the o1-preview model on both math and coding. Please see our blog post for more details.

Developed by: NovaSky Team from Sky Computing Lab at UC Berkeley.
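
As a quick reference, here is a minimal sketch of querying the model through the Ollama Python client. The tag `sky-t1` is illustrative only; substitute whatever tag you pulled the model under.

```python
# Minimal sketch: chat with the model via the Ollama Python client.
# The model tag "sky-t1" is an assumption -- replace it with the tag
# you actually pulled the model under.
import ollama

response = ollama.chat(
    model="sky-t1",
    messages=[
        {
            "role": "user",
            "content": "What is the sum of the first 100 positive integers?",
        },
    ],
)
print(response["message"]["content"])
```

At Q4_K_M the weights are about 20GB, so expect to need roughly that much free memory (plus overhead) to run the model locally.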

Training Details

Training Data

17K verified-correct responses sampled from Qwen/QwQ-32B-Preview on coding and math problems. In addition, we add the science portion from the Still-2 paper. A sketch of what "verified correct" filtering can look like is given below.
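
The "verified correct" filtering can be pictured as rejection sampling: sample a response from the teacher model and keep it only if its final answer matches the reference. The sketch below illustrates that idea under simplifying assumptions; the helper names are hypothetical and this is not the team's actual pipeline.

```python
# Hypothetical sketch of verified-correct data curation via rejection
# sampling. `sample_from_teacher` is an illustrative stub standing in
# for a call to the teacher model (e.g. QwQ-32B-Preview).
import re


def extract_final_answer(response: str) -> str | None:
    """Pull the last \\boxed{...} answer out of a reasoning trace."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", response)
    return matches[-1].strip() if matches else None


def verify(response: str, reference: str) -> bool:
    """Accept a response only if its final answer matches the reference."""
    answer = extract_final_answer(response)
    return answer is not None and answer == reference.strip()


def build_training_set(problems, sample_from_teacher):
    """Keep only teacher responses whose final answers check out."""
    kept = []
    for problem in problems:
        response = sample_from_teacher(problem["question"])
        if verify(response, problem["answer"]):
            kept.append(
                {"question": problem["question"], "response": response}
            )
    return kept
```

Exact string matching is a stand-in here: in practice, math answers are usually checked with symbolic equivalence and code responses by executing unit tests.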

Citation

@misc{sky_t1_2025,
  author       = {NovaSky Team},
  title        = {Sky-T1: Fully open-source reasoning model with o1-preview performance in $450 budget},
  howpublished = {https://novasky-ai.github.io/posts/sky-t1},
  note         = {Accessed: 2025-01-09},
  year         = {2025}
}