
A QWen3.5 large model that runs entirely on an A5000 or 4090 GPU (64k context), with tool-calling capability, suited to local deployment with Lobster and Hermes.

ollama run jedi-knight/qwen3.5-27b-64k-tools:v1.0

Details

16 hours ago · 2e5efbb56699 · 13GB

  • Architecture: qwen35
  • Parameters: 26.9B
  • Quantization: Q3_K_M
  • System prompt (excerpt): Comprehensive reasoning mode. If request is simple, then just answer. Otherwise, analyze the request
  • Runtime parameters (excerpt): { "num_ctx": 65536, "presence_penalty": 1.5, "repeat_last_n": 512, "repeat_penalty":
  • Template (excerpt): {{ .Prompt }}

Readme

Qwen3.5-27B 64K-Tools

A customized distribution of Qwen3.5-27B with three key modifications:

  1. Extended Context — 64K tokens (num_ctx raised from the 4K default to 65,536)
  2. Tool Use Enabled — Native function calling via official Qwen3.5 renderer/parser
  3. 100% GPU on 24GB — Fits entirely on RTX 3090 / 4090 / A5000

Quick Start

ollama pull jedi-knight/qwen3.5-27b-64k-tools
ollama run jedi-knight/qwen3.5-27b-64k-tools
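Tool calling can be exercised through Ollama's `/api/chat` endpoint, which accepts an OpenAI-style `tools` array. A minimal sketch of the request body (the `get_weather` tool is a hypothetical example, not something shipped with this model):

```python
import json

# Request payload for POST http://localhost:11434/api/chat.
# The get_weather tool below is illustrative only.
payload = {
    "model": "jedi-knight/qwen3.5-27b-64k-tools",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "stream": False,
}

# Send this with e.g. requests.post(...); when the model decides to call
# the tool, the reply carries the call in message.tool_calls.
print(json.dumps(payload, indent=2))
```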

Hardware Requirements

| GPU | VRAM | Status |
|-----------|-------|-------------------------|
| RTX 3090 | 24 GB | ✅ 100% GPU |
| RTX 4090 | 24 GB | ✅ 100% GPU |
| RTX A5000 | 24 GB | ✅ 100% GPU |
| RTX 4080 | 16 GB | ❌ Requires CPU offload |

Memory Breakdown

| Component | Size |
|------------------|----------|
| Weights (Q3_K_M) | ~16.5 GB |
| KV Cache (64K) | ~4.5 GB |
| Total | ~21 GB |
| VRAM Headroom | ~3 GB |
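The KV-cache line in the table can be sanity-checked with the standard sizing formula: 2 (K and V) × layers × KV heads × head dim × context length × bytes per element. A sketch below — the architecture numbers are illustrative assumptions chosen to land near the table's ~4.5 GB figure, not the actual Qwen3.5-27B config:

```python
# Rough KV-cache sizing formula; all architecture numbers here are
# hypothetical assumptions, not the real Qwen3.5-27B configuration.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # 2 accounts for separate K and V caches; default fp16 elements.
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Example: a hypothetical 36-layer GQA model with 4 KV heads of dim 128
# at the full 65,536-token context, fp16 cache:
gib = kv_cache_bytes(36, 4, 128, 65536) / 2**30
print(f"~{gib:.1f} GiB")  # → ~4.5 GiB
```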

Model Details

  • Base Model: Qwen3.5-27B (Alibaba Cloud)
  • Quantization: Q3_K_M
  • Architecture: qwen35
  • Parameters: 26.9B
  • Max Context: 65,536
  • Capabilities: completion, tools, thinking

Comparison with Official Version

| Feature | Official qwen3.5:27b | This Model |
|-----------------|----------------------|------------|
| Quantization | Q4_K_M | Q3_K_M |
| Default Context | 32,768 | 65,536 |
| Total Size | ~25 GB | ~21 GB |
| GPU Load | 84% GPU / 16% CPU | 100% GPU |
| Tools Support | ❌ | ✅ |

Build from Source

ollama create qwen3.5-27b-64k-tools -f Modelfile
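The Modelfile itself is not shown on this page; a plausible sketch, assuming a locally downloaded GGUF (the filename is hypothetical) and the runtime parameters listed under Details:

```
# Hypothetical Modelfile sketch; filename and system prompt are
# reconstructed from the Details excerpts above.
FROM ./Qwen3.5-27B-Q3_K_M.gguf
PARAMETER num_ctx 65536
PARAMETER presence_penalty 1.5
PARAMETER repeat_last_n 512
SYSTEM """Comprehensive reasoning mode. If request is simple, then just answer. Otherwise, analyze the request."""
# TEMPLATE omitted: tool calling relies on the full official Qwen3.5
# chat template rather than a bare {{ .Prompt }} passthrough.
```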

License

This model is based on Qwen3.5-27B by Alibaba Cloud, licensed under Apache License 2.0. The Q3_K_M GGUF weights are derived from the community conversion by bartowski.