Fine-tuned from Phi-3-mini-4k-instruct with QLoRA on 2K Python examples from the Magicoder-OSS-Instruct-75K dataset. Designed for coding Q&A: it explains programming concepts with examples. Distributed as a 4-bit (Q4_K_M) quantization, so it runs efficiently on consumer hardware.

ollama run hanneselundstrom/phi-coding-instructor

curl http://localhost:11434/api/chat \
  -d '{
    "model": "hanneselundstrom/phi-coding-instructor",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='hanneselundstrom/phi-coding-instructor',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'hanneselundstrom/phi-coding-instructor',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

Updated 6 days ago

6aaceff6feb0 · 2.3GB

model

arch: llama
parameters: 3.82B
quantization: Q4_K_M

2.3GB

template

{{ if .System }}<|system|> {{ .System }}<|end|> {{ end }}{{ if .Prompt }}<|user|> {{ .Prompt }}<|end

149B

params

{ "min_p": 0.1, "stop": [ "<|end|>", "<|user|>", "<|assistant|>"

108B
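
The sampling defaults in the params block can also be set per request through the `options` field of Ollama's `/api/chat` endpoint. The sketch below is one possible way to do that with only the Python standard library. It copies just the values visible in the truncated params listing above, so the published defaults may contain more entries; the helper names `build_chat_request` and `send` are hypothetical.

```python
import json
import urllib.request

MODEL = "hanneselundstrom/phi-coding-instructor"

# Values copied from the visible part of the params block. The listing is
# truncated, so the model's real defaults may include further entries.
VISIBLE_DEFAULTS = {
    "min_p": 0.1,
    "stop": ["<|end|>", "<|user|>", "<|assistant|>"],
}


def build_chat_request(prompt, host="http://localhost:11434", **overrides):
    """Build (but do not send) a POST request for the /api/chat endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        # Keyword overrides replace the visible defaults key by key.
        "options": {**VISIBLE_DEFAULTS, **overrides},
        # Ask for a single JSON object instead of a stream of chunks.
        "stream": False,
    }
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running Ollama server and return the reply text."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())["message"]["content"]
```

With a local Ollama server running and the model pulled, `send(build_chat_request("Explain list comprehensions.", temperature=0.2))` would return the reply as a string.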

Readme

phi-coding-instructor

Fine-tuned from Phi-3-mini-4k-instruct on the Magicoder-OSS-Instruct-75K dataset (2K Python examples) using QLoRA.

Training ran on a single T4 GPU in Google Colab and took about 30 minutes.

Training details

Base model: unsloth/Phi-3-mini-4k-instruct
Quantization: 4-bit (Q4_K_M)
LoRA rank: 16
Learning rate: 2e-4
Batch size: 8 (2 per device × 4 gradient accumulation)
Steps: 500
Dataset: 2,000 Python coding instruction/response pairs
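
The figures above imply how many passes the run made over the data. As a quick check, assuming "Steps" counts optimizer steps and that training used the single GPU mentioned earlier:

```python
# Numbers taken from the training details above.
per_device_batch = 2
grad_accum_steps = 4
steps = 500
dataset_size = 2_000

# Assumption: one GPU, so the effective batch is per-device size times
# the number of gradient accumulation steps.
effective_batch = per_device_batch * grad_accum_steps

# Assumption: "Steps" means optimizer steps, each consuming one effective batch.
examples_seen = effective_batch * steps
epochs = examples_seen / dataset_size

print(f"effective batch size: {effective_batch}")
print(f"examples seen: {examples_seen}")
print(f"approximate epochs: {epochs}")
```

Under those assumptions the run covers the 2,000 examples about twice.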

Example

What is multi-head attention? → Explains the concept with context about Transformer architecture
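
A minimal sketch of asking this question from Python, using the same `chat` call shown in the usage snippet near the top of the page. The helpers `build_messages` and `ask` are hypothetical names, and the optional system message is an assumption based on the template accepting a `.System` value.

```python
MODEL = "hanneselundstrom/phi-coding-instructor"


def build_messages(question, system=None):
    """Return a chat message list, with an optional system message first."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": question})
    return messages


def ask(question, system=None):
    """Send one question to a local Ollama server and return the reply text."""
    # Imported lazily so the module loads without the package installed;
    # the call itself needs `pip install ollama` and a running server.
    from ollama import chat

    response = chat(model=MODEL, messages=build_messages(question, system))
    return response.message.content
```

For example, `ask("What is multi-head attention?")` sends the prompt from the example above; the wording of any system message is up to the caller.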

Running locally

ollama run hanneselundstrom/phi-coding-instructor