
Tencent WeDLM-7B-Base converted to GGUF (Q4_K_M). A text-diffusion model based on Qwen2.5 architecture, optimized for efficient parallel decoding.

ollama run doitmagic/wedlm-7b-base

WeDLM-7B-Base (GGUF Quantized)

This model is a GGUF conversion of tencent/WeDLM-7B-Base, quantized to Q4_K_M for efficient local inference via Ollama.

Model Details

WeDLM (Web-enhanced Diffusion Language Model), developed by Tencent, reconciles diffusion language modeling with standard causal attention, enabling fast parallel decoding at inference time.

  • Original Repo: tencent/WeDLM-7B-Base
  • Base Architecture: Qwen2.5-7B
  • Quantization: Q4_K_M (4-bit medium; balanced quality/speed)
  • Context Length: 16k native; effective context depends on available system memory.
  • License: Apache 2.0

Usage

You can run this model directly with Ollama:

ollama run doitmagic/wedlm-7b-base
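
You can also call the model through Ollama's local HTTP API (it listens on localhost:11434 by default). A minimal sketch in Python using only the standard library; the prompt and the `num_ctx` value are illustrative choices, not requirements of this model:

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot generation
URL = "http://localhost:11434/api/generate"

# Request payload: "stream": False returns a single JSON object
# instead of a stream of partial responses. num_ctx of 16384
# matches the model's native 16k context (assumed to fit in RAM).
payload = {
    "model": "doitmagic/wedlm-7b-base",
    "prompt": "Explain diffusion language models in one sentence.",
    "stream": False,
    "options": {"num_ctx": 16384},
}

def generate(url: str = URL) -> str:
    """POST the payload to Ollama and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling generate() requires a running Ollama server (start one with ollama serve if it is not already running).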

Setup & Conversion

This model was converted and quantized by doITmagic using llama.cpp. It uses the qwen2 architecture definition to ensure compatibility with standard inference engines like Ollama.
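
If you want to package a local GGUF file the same way yourself, a minimal Modelfile sketch is enough; the filename below is hypothetical and should point at wherever your quantized file lives:

```
# Modelfile (hypothetical path to the quantized weights)
FROM ./WeDLM-7B-Base-Q4_K_M.gguf
```

Running ollama create wedlm-7b-base -f Modelfile then registers the model locally so it can be launched with ollama run as shown above.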