qwen:1.8b-chat-v1.5-q3_K

Qwen 1.5 is a series of large language models by Alibaba Cloud, ranging from 0.5B to 110B parameters.

Details

Updated 2 years ago

1e7a582ce4d6 · 1.0GB

model

arch: qwen2

parameters: 1.84B

quantization: Q3_K_M

1.0GB
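
As a rough sanity check on the figures above, the file size and parameter count can be combined into an effective bits-per-weight number. This is only a sketch: it assumes "1.0GB" means 10^9 bytes, and the listed size is rounded, so the result is approximate. The file also contains metadata and may keep some tensors at higher precision than the nominal quantization level, so the figure is an average over the whole file. The function name is made up for illustration.

```python
def effective_bits_per_weight(file_size_bytes: float, n_params: float) -> float:
    """Average number of bits stored per parameter, over the whole file."""
    return file_size_bytes * 8 / n_params


# Figures from the listing: 1.0GB file (assumed to be 1.0e9 bytes), 1.84B parameters.
bpw = effective_bits_per_weight(1.0e9, 1.84e9)
print(f"{bpw:.2f} bits per weight")
```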

template

{{ if .System }}<|im_start|>system {{ .System }}<|im_end|>{{ end }}<|im_start|>user {{ .Prompt }}<|i

130B

license

Tongyi Qianwen RESEARCH LICENSE AGREEMENT Tongyi Qianwen Release Date: November 30, 2023 By clicking

7.3kB

params

{ "stop": [ "<|im_start|>", "<|im_end|>" ] }

59B
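
The `stop` parameter lists strings at which generation should end. The following is a minimal sketch of that idea, not Ollama's actual implementation: the function `truncate_at_stop` is a hypothetical helper that cuts a piece of generated text at the earliest stop sequence it contains.

```python
from typing import List

# Stop sequences taken from the params block above.
STOP = ["<|im_start|>", "<|im_end|>"]


def truncate_at_stop(text: str, stops: List[str]) -> str:
    """Return text up to (not including) the earliest stop sequence, if any."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


raw = "Hello there!<|im_end|><|im_start|>user more text"
print(truncate_at_stop(raw, STOP))
```

Without these stop strings, a chat model could keep generating past the end of its own turn and start writing the next user turn itself.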

Qwen 2 is now available.

Qwen is a series of transformer-based large language models by Alibaba Cloud, pre-trained on a large volume of data, including web texts, books, code, etc.

New in Qwen 1.5

- 8 model sizes: 0.5B, 1.8B, 4B (default), 7B, 14B, 32B (new), 72B, and 110B
  - ollama run qwen:0.5b
  - ollama run qwen:1.8b
  - ollama run qwen:4b
  - ollama run qwen:7b
  - ollama run qwen:14b
  - ollama run qwen:32b
  - ollama run qwen:72b
  - ollama run qwen:110b
- Significant improvement in human preference for chat models
- Multilingual support in both base and chat models
- Stable support for 32K context length across all model sizes
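
Besides `ollama run`, a model tag can be used through Ollama's HTTP API. The sketch below only builds a request body for the `/api/generate` endpoint and prints it; it does not contact a server. It assumes that "32K" corresponds to 32768 tokens and uses the `num_ctx` option to request that context window. The helper name `generate_payload` and the example prompt are illustrative, and a larger context window needs more memory at run time.

```python
import json

# Size tags from the list of `ollama run` commands above.
QWEN15_SIZES = ["0.5b", "1.8b", "4b", "7b", "14b", "32b", "72b", "110b"]


def generate_payload(size: str, prompt: str, num_ctx: int = 32768) -> dict:
    """Build a JSON body for POST /api/generate for a given Qwen 1.5 size."""
    if size not in QWEN15_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {QWEN15_SIZES}")
    return {
        "model": f"qwen:{size}",
        "prompt": prompt,
        "stream": False,  # ask for one JSON response instead of a stream
        "options": {"num_ctx": num_ctx},  # requested context window, in tokens
    }


payload = generate_payload("1.8b", "Summarize the benefits of a long context window.")
print(json.dumps(payload, indent=2))
```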

The original Qwen model is offered in four different parameter sizes: 1.8B, 7B, 14B, and 72B.

Features

- Low-cost deployment: the minimum memory requirement for inference is less than 2GB.
- Large-scale, high-quality training corpora: models are pre-trained on over 2.2 trillion tokens of Chinese, English, and multilingual text, code, and mathematics, covering general and professional fields. The distribution of the pre-training corpus was tuned through extensive ablation experiments.
- Good performance: Qwen supports long context lengths (8K on the 1.8B, 7B, and 14B models, and 32K on the 72B model). It significantly outperforms existing open-source models of similar scale on multiple Chinese and English downstream evaluation tasks (common sense, reasoning, code, mathematics, and others), and surpasses some larger models on several benchmarks.
- More comprehensive vocabulary coverage: Qwen uses a vocabulary of over 150K tokens, which is more comprehensive than that of other open-source models based on Chinese and English vocabularies. It handles multiple languages better, so users can strengthen the model in a particular language without expanding the vocabulary.
- System prompt: Qwen supports role-playing, language style transfer, task setting, and behavior setting through a system prompt.
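
One way to try the system prompt feature is through Ollama's `/api/chat` endpoint, which accepts a list of messages with roles. The sketch below builds such a request and prints it; the `send` helper is defined but not called, since it needs a running server. It assumes a local Ollama instance at the default address `http://localhost:11434`. The function names and the example prompts are illustrative.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # assumed default local address


def chat_request(system: str, user: str, model: str = "qwen:1.8b") -> dict:
    """Build a chat request whose first message sets the system prompt."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "stream": False,  # ask for one JSON response instead of a stream
    }


def send(payload: dict, url: str = OLLAMA_CHAT_URL) -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


payload = chat_request(
    system="You are a patient tutor. Answer in two sentences.",
    user="What is a context window?",
)
print(json.dumps(payload, indent=2))
```

With a server running and the model pulled, `send(payload)` would return the reply as a string. Changing only the `system` text is enough to switch the role or style while keeping the user message the same.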

Reference

GitHub

Hugging Face
