DeepSeek-R1-Distill-Qwen-32B-Japanese (GGUF)

🧠 Japanese instruction-tuned LLM by CyberAgent, built on DeepSeek-R1-Distill-Qwen-32B (DeepSeek-R1 reasoning distilled into a Qwen2.5-32B base).

🔹 Model Overview

  • Architecture: Qwen (transformer-based)
  • Size: 32B parameters (distilled)
  • Context length: 4096 tokens
  • Language: Japanese (native), English (partial)
  • Format: GGUF, quantized (e.g. q8_0); see the loading sketch after this list
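
The snippet below is a minimal sketch of loading the quantized GGUF with the llama-cpp-python bindings. The local file name, `n_gpu_layers` setting, and prompt are illustrative assumptions, not part of the official release.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./DeepSeek-R1-Distill-Qwen-32B-Japanese-q8_0.gguf",  # hypothetical local path
    n_ctx=4096,       # matches the context length listed above
    n_gpu_layers=-1,  # offload all layers to the GPU when one is available
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "日本語で自己紹介をしてください。"}],  # "Please introduce yourself in Japanese."
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```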

🔹 Source

https://huggingface.co/cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese

🔹 License

  • License: MIT
  • Provider: CyberAgent, Inc.
  • See full terms: LICENSE

🔹 Tags

japanese, 32b, qwen, cyberagent, ollama, instruction-tuned

🔹 Notes

This model is suited to Japanese-language instruction-following tasks such as summarization, question answering, and general chat; a usage sketch follows.
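
As a hedged illustration, the sketch below sends a Japanese instruction through the `ollama` Python client. The model tag shown is a placeholder for whatever name the GGUF was pulled or created under locally.

```python
import ollama

# Model tag is a placeholder; substitute the tag used when pulling/creating the model.
response = ollama.chat(
    model="deepseek-r1-distill-qwen-32b-japanese",
    messages=[
        {"role": "user", "content": "囲碁と将棋の違いを簡単に説明してください。"},  # "Briefly explain the difference between Go and shogi."
    ],
)
print(response["message"]["content"])
```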