DeepSeek-7B-1M

DeepSeek-7B-1M is a hybrid model combining Qwen2.5-1.5B-1M and DeepSeek-R1-Distill-Qwen-1.5B, designed for enhanced reasoning, long-context understanding, and structured output generation. This model is optimized for mathematical problem-solving, code generation, and natural language understanding.

🚀 Features

  • Combines the strengths of Qwen2.5-1.5B-1M and DeepSeek-R1-Distill-Qwen-1.5B
  • Improved mathematical reasoning and structured output generation
  • Extended context length handling up to 32K tokens
  • Optimized for logical inference, code completion, and technical problem-solving
  • Supports multi-turn conversations with enhanced coherence

📥 Installation

1๏ธโƒฃ Install Ollama

If Ollama is not installed, install it using:

For macOS & Linux:

curl -fsSL https://ollama.com/install.sh | sh

For Windows (WSL required):

wsl --install

Then, inside the WSL shell:

curl -fsSL https://ollama.com/install.sh | sh

For more details, see the official Ollama download page:
https://ollama.com/download
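
Once installed, it is worth verifying that the CLI works (a quick check, assuming a default local install):

ollama --version

This should print the installed Ollama version.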


🔧 Running the Model

2๏ธโƒฃ Pull the Model

After installing Ollama, download DeepSeek-7B-1M:

ollama pull myrepo/deepseek-7b-1m
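
To confirm the download completed, list the models available locally:

ollama list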

3๏ธโƒฃ Run the Model

To start generating responses:

ollama run myrepo/deepseek-7b-1m

ollama run opens an interactive chat session by default. To get a single response without entering the session, pass the prompt inline:

ollama run myrepo/deepseek-7b-1m "Summarize the quicksort algorithm"
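
Ollama also exposes a local HTTP API (port 11434 by default), which is handy for scripting multi-turn conversations. A minimal sketch, assuming the placeholder model name used above:

curl http://localhost:11434/api/chat -d '{
  "model": "myrepo/deepseek-7b-1m",
  "messages": [
    {"role": "user", "content": "What is 12 * 34?"}
  ],
  "stream": false
}'

To continue a conversation, append the assistant's reply and your next question to the messages array and send the request again.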

📌 Customizing the Model

Using a Custom Modelfile

You can adjust model behavior using a Modelfile.

  1. Create a new file named Modelfile and add the following:

FROM myrepo/deepseek-7b-1m

PARAMETER temperature 0.7
PARAMETER top_p 0.9

SYSTEM "You are an AI expert trained for advanced reasoning, coding, and mathematical problem-solving. Provide detailed, structured, and optimized responses."

  2. Build and run your custom model:

ollama create deepseek-7b-custom -f Modelfile
ollama run deepseek-7b-custom
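
Because the model is advertised with long-context support, you may also want to raise the context window in the Modelfile. A minimal sketch using the num_ctx parameter (32768 here simply mirrors the 32K figure above; larger windows consume more memory):

FROM myrepo/deepseek-7b-1m

PARAMETER num_ctx 32768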

🎯 Performance Optimization

For the best performance:

  • Run Ollama on a GPU-enabled system
  • Use quantized variants (e.g., Q4_K_M or Q8_0) for efficient inference
  • Deploy on high-memory machines or cloud instances (32GB RAM or more)
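
To check whether the model is actually offloaded to the GPU, inspect the running models:

ollama ps

The PROCESSOR column reports the split; "100% GPU" means the entire model fits in VRAM.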


📄 License

This model is released under the MIT License, which permits free use, modification, and redistribution.


🔗 Resources

  • Hugging Face Repository: (if applicable)
  • Ollama Documentation: https://ollama.com/docs
  • GitHub Repository: (if applicable)

🙌 Acknowledgments

DeepSeek-7B-1M is built by merging Qwen2.5-1.5B-1M and DeepSeek-R1-Distill-Qwen-1.5B, leveraging their strengths in math, structured reasoning, and technical NLP.