Granite 4.0 Tiny Preview

Granite-4.0-Tiny-Preview is a 7B-parameter, fine-grained hybrid mixture-of-experts (MoE) instruct model fine-tuned from Granite-4.0-Tiny-Base-Preview using a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets tailored for solving long-context problems. The model was developed with a structured chat format using a diverse set of techniques, including supervised fine-tuning and model alignment with reinforcement learning.

Parameter Sizes

7B:

NOTE: This is a draft model name and may change.

ollama run gabegoodhart/granite4-preview:tiny
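
Once pulled, the model can also be queried programmatically through Ollama's REST API. The following is a minimal sketch, assuming a local Ollama server on the default port; the prompt text is purely illustrative.

# Minimal non-streaming chat request against a local Ollama server.
# Uses the draft model name above; the prompt is just an example.
curl http://localhost:11434/api/chat -d '{
  "model": "gabegoodhart/granite4-preview:tiny",
  "messages": [
    {"role": "user", "content": "Give a two-sentence overview of mixture-of-experts models."}
  ],
  "stream": false
}'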

Supported Languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may fine-tune this Granite model for languages beyond these 12.

Intended Use

This model is designed to handle general instruction-following tasks and can be integrated into AI assistants across various domains, including business applications.

Capabilities

  • Thinking
  • Summarization
  • Text classification
  • Text extraction
  • Question-answering
  • Retrieval Augmented Generation (RAG)
  • Code-related tasks
  • Function-calling tasks (see the sketch after this list)
  • Multilingual dialog use cases
  • Long-context tasks, including long document/meeting summarization, long document QA, etc.
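
As a sketch of the function-calling capability, Ollama's chat endpoint accepts a tools array. Everything model-specific below is an assumption for illustration: get_weather is a hypothetical tool, and the exact tool-call output depends on the model's chat template.

# Hedged sketch: expose one hypothetical tool ("get_weather") to the model.
# If the model decides to call it, the reply carries message.tool_calls.
curl http://localhost:11434/api/chat -d '{
  "model": "gabegoodhart/granite4-preview:tiny",
  "messages": [
    {"role": "user", "content": "What is the weather in Paris right now?"}
  ],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": {"type": "string", "description": "City name"}
        },
        "required": ["city"]
      }
    }
  }],
  "stream": false
}'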

Evaluation Results

Comparison with previous Granite models¹:

| Model | Arena-Hard | AlpacaEval-2.0 | MMLU | PopQA | TruthfulQA | BigBenchHard | DROP | GSM8K | HumanEval | HumanEval+ | IFEval | AttaQ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Granite-3.3-2B-Instruct | 28.86 | 43.45 | 55.88 | 18.4 | 58.97 | 52.51 | 35.98 | 72.48 | 80.51 | 75.68 | 65.8 | 87.47 |
| Granite-3.3-8B-Instruct | 57.56 | 62.68 | 65.54 | 26.17 | 66.86 | 59.01 | 41.53 | 80.89 | 89.73 | 86.09 | 74.82 | 88.5 |
| Granite-4.0-Tiny-Preview | 26.70 | 35.16 | 60.40 | 22.93 | 58.07 | 55.71 | 46.22 | 70.05 | 82.41 | 78.33 | 63.03 | 86.10 |

¹ Scores for AlpacaEval-2.0 and Arena-Hard are calculated with thinking=True.
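
The thinking=True in the footnote refers to the evaluation setup. As a sketch only: recent Ollama releases expose a think flag on the chat endpoint for reasoning models, but whether it applies to this preview build is an assumption, so verify against your Ollama version.

# Hedged sketch: request the model's thinking trace via Ollama's "think"
# flag. Availability depends on the Ollama version and model template.
curl http://localhost:11434/api/chat -d '{
  "model": "gabegoodhart/granite4-preview:tiny",
  "messages": [
    {"role": "user", "content": "Which is larger, 9.9 or 9.11?"}
  ],
  "think": true,
  "stream": false
}'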

Resources