azure99/blossom-v6.2:36b

6 Downloads Updated 2 days ago

8b 14b 32b 36b

eb68a1a6b012 · 22GB · arch seed_oss · parameters 36.2B · quantization Q4_K_M

A chat between a user and an artificial intelligence assistant. The assistant gives helpful, detaile

927B

{ "repeat_penalty": 1.05, "temperature": 0.5, "top_k": 0, "top_p": 0.85 }

65B
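
The sampling parameters above ship with the model, so Ollama applies them by default. As a minimal sketch of how the same values could be passed explicitly, the snippet below builds a request for Ollama's `/api/generate` endpoint using only the Python standard library. The host `http://localhost:11434` assumes a local Ollama server on its default port, and the helper names `build_generate_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Sampling options copied from the params block shown on this page.
BLOSSOM_OPTIONS = {"repeat_penalty": 1.05, "temperature": 0.5, "top_k": 0, "top_p": 0.85}


def build_generate_request(prompt, model="azure99/blossom-v6.2:36b",
                           host="http://localhost:11434"):
    """Build (without sending) a POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,             # ask for one JSON object instead of a stream
        "options": BLOSSOM_OPTIONS,  # overrides the model's packaged defaults
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the request; needs a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_generate_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Hello")` only works once the model has been pulled locally; `build_generate_request` can be inspected without a server.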

Readme

# **BLOSSOM-V6.2**

[💻Github](https://github.com/Azure99/BlossomLM) • [🚀Blossom Chat Demo](https://blossom-chat.com/)

### Introduction

Blossom is an open-source conversational large language model released together with reproducible post-training data. It aims to give everyone an open, capable, and cost-effective general-purpose model that can run locally.

| Chat Model                                                   | Resource                                                     | Base Model        |
| ------------------------------------------------------------ | ------------------------------------------------------------ | ----------------- |
| [Blossom-V6.2-36B](https://huggingface.co/Azure99/Blossom-V6.2-36B) | [Demo](https://huggingface.co/spaces/Azure99/Blossom-V6.2-36B-Demo) [GGUF](https://huggingface.co/Azure99/Blossom-V6.2-36B-GGUF) [Ollama](https://ollama.com/azure99/blossom-v6.2:36b) | Seed-OSS-36B-Base |
| [Blossom-V6.2-32B](https://huggingface.co/Azure99/Blossom-V6.2-32B) | [Demo](https://huggingface.co/spaces/Azure99/Blossom-V6.2-32B-Demo) [GGUF](https://huggingface.co/Azure99/Blossom-V6.2-32B-GGUF) [Ollama](https://ollama.com/azure99/blossom-v6.2:32b) | Qwen2.5-32B       |
| [Blossom-V6.2-14B](https://huggingface.co/Azure99/Blossom-V6.2-14B) | [Demo](https://huggingface.co/spaces/Azure99/Blossom-V6.2-14B-Demo) [GGUF](https://huggingface.co/Azure99/Blossom-V6.2-14B-GGUF) [Ollama](https://ollama.com/azure99/blossom-v6.2:14b) | Qwen3-14B-Base    |
| [Blossom-V6.2-8B](https://huggingface.co/Azure99/Blossom-V6.2-8B) | [Demo](https://huggingface.co/spaces/Azure99/Blossom-V6.2-8B-Demo) [GGUF](https://huggingface.co/Azure99/Blossom-V6.2-8B-GGUF) [Ollama](https://ollama.com/azure99/blossom-v6.2:8b) | Qwen3-8B-Base     |

Hint: Across the vast majority of use cases, **Blossom-V6.2-36B** outperforms **Blossom-V6.2-32B**.

You can find the training data here: [Blossom-V6.2-SFT-Stage1](https://huggingface.co/datasets/Azure99/blossom-v6.2-sft-stage1) (1 epoch) and [Blossom-V6.2-SFT-Stage2](https://huggingface.co/datasets/Azure99/blossom-v6.2-sft-stage2) (3 epochs).

### **Data Synthesis Workflow Overview**

The workflow primarily uses three cost-effective models, Deepseek-V3.1, Gemini 2.5 Flash, and Qwen3-235B-A22B-Instruct-2507 (denoted A, B, and C), to regenerate responses under different scenarios using tailored synthesis strategies.

For example:

- In objective scenarios like mathematics (where answers are unique), Model A first generates responses as a "teacher." If reference answers exist in the source data, Model B verifies the correctness of A's responses against them. If no reference answers exist, Model C generates a second response, and Model B checks consistency between A's and C's outputs. Inconsistent responses are filtered out.
- For subjective scenarios, the three models cross-evaluate one another. For instance, Models A and B generate responses to a question, and Model C evaluates which is better. The superior response may be retained as training data or used for preference data construction. To mitigate model bias, roles (respondent/evaluator) are randomly assigned to A, B, and C in each instance.
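
The two strategies above can be sketched roughly as follows. This is an illustration under stated assumptions, not the BlossomData implementation: each model is stood in for by a plain Python callable, and the function names, argument shapes, and returned dictionary are all hypothetical.

```python
import random
from typing import Callable, Dict, Optional

Generate = Callable[[str], str]          # question -> answer
Judge = Callable[[str, str, str], bool]  # (question, answer, other) -> consistent?
Prefer = Callable[[str, str, str], int]  # (question, first, second) -> index of better one


def synthesize_objective(question: str, reference: Optional[str],
                         teacher_a: Generate, judge_b: Judge,
                         second_c: Generate) -> Optional[str]:
    """Return A's answer if B accepts it, otherwise None (sample is filtered out)."""
    answer = teacher_a(question)
    # With a reference answer, B checks A against it; without one,
    # C produces a second answer and B checks A against C instead.
    other = reference if reference is not None else second_c(question)
    return answer if judge_b(question, answer, other) else None


def synthesize_subjective(question: str, generators: Dict[str, Generate],
                          evaluators: Dict[str, Prefer],
                          rng: random.Random) -> Dict[str, str]:
    """Randomly pick two respondents and one evaluator among three models."""
    names = list(generators)
    rng.shuffle(names)  # random role assignment to reduce single-model bias
    first, second, evaluator = names
    responses = [generators[first](question), generators[second](question)]
    winner = evaluators[evaluator](question, responses[0], responses[1])
    return {
        "chosen": responses[winner],
        "rejected": responses[1 - winner],
        "evaluator": evaluator,
    }
```

The `chosen`/`rejected` pair mirrors the two uses named above: the chosen response alone can serve as training data, while the pair can serve as preference data.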

Additional rule-based filtering is applied, such as:

- N-Gram filtering to remove data with many repetitions.
- Discarding questions containing toxic content that triggers teacher model refusals.
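
As a rough illustration of the N-gram rule, the sketch below measures what fraction of a response's n-grams occur more than once and drops responses above a threshold. The whitespace tokenization, the n-gram size of 3, and the 0.3 threshold are assumptions chosen for the example; the source does not state the actual settings, and whitespace splitting would not suit unsegmented text such as Chinese.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Fraction of n-gram occurrences that belong to an n-gram seen more than once."""
    tokens = text.split()  # assumption: whitespace tokenization
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


def keep_sample(text: str, n: int = 3, max_ratio: float = 0.3) -> bool:
    """Keep a sample only if its repetition ratio stays at or below the threshold."""
    return repeated_ngram_ratio(text, n) <= max_ratio
```

A degenerate output such as `"very good " * 10` is rejected by `keep_sample`, while an ordinary sentence with no repeated trigrams is kept.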

Further technical details will be released in the future. The data is synthesized by the [🌸BlossomData](https://github.com/Azure99/BlossomData) framework.
