1,965 · 6 months ago

`DeepSeekR1-QwQ-SkyT1-32B-Fusion` is a merged model that combines the strengths of three powerful Qwen-based models: huihui-ai/DeepSeek-R1-Distill-Qwen-32B-abliterated, huihui-ai/QwQ-32B-Preview-abliterated, and huihui-ai/Sky-T1-32B-Preview-abliterated.

32b · 6 months ago · 8d1cd0a81f85 · 20GB

Architecture: qwen2 · Parameters: 32.8B · Quantization: Q4_K_M
License: MIT License, Copyright (c) 2023 DeepSeek
Stop tokens: "<|begin▁of▁sentence|>", "<|end▁of▁sentence|>"

Readme

Although it is a simple merge, the model is usable and produces no gibberish. This is an experiment.

I tested the 80:10:10, 70:15:15, and 60:20:20 ratios separately to see how much impact each ratio has on the model.
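A mix like this can be read as a linear, weighted average of the three models' parameters. Below is a minimal sketch of that idea only, not the actual merge script: it uses plain Python dicts of float lists in place of real tensors, and the parameter name `layer.w` and helper `merge_weights` are illustrative. A real merge would operate on full model checkpoints with a tensor library or a dedicated merging tool.

```python
def merge_weights(models, ratios):
    """Linearly combine matching parameters from several models.

    `models` is a list of dicts mapping parameter names to flat lists of
    floats (standing in for real tensors); `ratios` are the mixing
    proportions and should sum to 1, e.g. 0.6/0.2/0.2 for a 60:20:20 merge.
    """
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    merged = {}
    for name in models[0]:
        params = [m[name] for m in models]
        merged[name] = [
            sum(r * p[i] for r, p in zip(ratios, params))
            for i in range(len(params[0]))
        ]
    return merged

# Toy 60:20:20 mix of three "models" sharing one 2-value parameter.
r1  = {"layer.w": [1.0, 1.0]}   # DeepSeek-R1 stand-in
qwq = {"layer.w": [0.0, 0.0]}   # QwQ stand-in
sky = {"layer.w": [2.0, 2.0]}   # Sky-T1 stand-in
mix = merge_weights([r1, qwq, sky], [0.6, 0.2, 0.2])
# each element: 0.6*1.0 + 0.2*0.0 + 0.2*2.0 = 1.0
```

Changing the ratio list (e.g. `[0.8, 0.1, 0.1]` for 80:10:10) shifts the merged weights toward the dominant model, which is exactly what the ratio comparison above probes.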

References

HuggingFace