yasserrmd/Llama-4-Scout-17B-16E-Instruct:latest

624 Downloads Updated 10 months ago

tools

ollama run yasserrmd/Llama-4-Scout-17B-16E-Instruct

curl http://localhost:11434/api/chat \
  -d '{
    "model": "yasserrmd/Llama-4-Scout-17B-16E-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='yasserrmd/Llama-4-Scout-17B-16E-Instruct',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'yasserrmd/Llama-4-Scout-17B-16E-Instruct',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details

b6cf35b8f35b · 43GB

model

arch llama4

parameters 108B

quantization Q2_K

43GB

params

{ "stop": [ "<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"

65B
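To illustrate what the stop parameters above are for, here is a minimal sketch of how a stop sequence conceptually ends a reply: output is cut at the first stop string that appears. The helper name `truncate_at_stop` is hypothetical, the list contains only the three strings visible on this page (the listing is truncated, so there may be more), and this is not a description of how Ollama implements stopping internally.

```python
# Stop strings visible on this page; the listing is truncated there,
# so this list may be incomplete.
STOP_SEQUENCES = ["<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"]


def truncate_at_stop(text, stops=STOP_SEQUENCES):
    """Return text up to (not including) the earliest stop string, if any."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


if __name__ == "__main__":
    # Hypothetical raw output that runs past the end-of-turn marker.
    raw = "RAG retrieves documents first.<|eot_id|><|start_header_id|>user"
    print(truncate_at_stop(raw))
```

Without stop strings like these, a chat model can run past the end of its turn and start writing the next header itself.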

template

{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|> {{- if .System }} {{ .System }

1.5kB

Readme

Llama-4-Scout-17B-16E-Instruct

Quantized version of Llama-4-Scout-17B-16E-Instruct by Unsloth, optimized for instruction-following tasks and runnable via Ollama.

Run

ollama run yasserrmd/Llama-4-Scout-17B-16E-Instruct

⚠️ Requires ~43 GiB system RAM even with Q2_K_XL quantization.
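Before pulling a download of this size, it may help to check whether the machine has roughly that much memory. The following is a rough sketch only: the 43 GiB threshold is the approximate figure from this README, the function names are hypothetical, and `os.sysconf` works on POSIX systems (Linux, macOS) but not on Windows. Actual memory use also depends on context length and on what else is running.

```python
import os

GIB = 2**30
REQUIRED_GIB = 43  # approximate figure taken from the README


def has_enough_ram(total_bytes, required_gib=REQUIRED_GIB):
    """True if total_bytes is at least required_gib GiB."""
    return total_bytes >= required_gib * GIB


def total_ram_bytes():
    """Total physical memory in bytes (POSIX only; not available on Windows)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__":
    total = total_ram_bytes()
    print(f"Total RAM: {total / GIB:.1f} GiB")
    print("Looks sufficient" if has_enough_ram(total) else "Likely insufficient")
```

The check compares total rather than free memory, so a pass here is a necessary condition, not a guarantee that the model will load.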

Notes

Format: Q2_K_XL (lightweight GGUF quant)
Good for Q&A, summaries, code, and chat
Built with Unsloth for efficient fine-tuning

Example

> Explain RAG in simple terms.
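The same prompt can also be sent programmatically. The sketch below uses only the Python standard library and the local `/api/chat` endpoint shown in the curl example above, assuming an Ollama server is running on the default port with this model already pulled. The helper names `build_chat_payload` and `ask` are illustrative, and `"stream": False` is set so the server returns a single JSON object rather than a stream of chunks.

```python
import json
import urllib.request

# Default local endpoint, as in the curl example on this page.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "yasserrmd/Llama-4-Scout-17B-16E-Instruct"


def build_chat_payload(prompt, model=MODEL):
    """Build a non-streaming /api/chat request body for a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one JSON object instead of a stream of chunks
    }


def ask(prompt, url=OLLAMA_CHAT_URL):
    """POST the prompt to a running Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


if __name__ == "__main__":
    print(ask("Explain RAG in simple terms."))
```

Keeping the payload construction separate from the network call makes it easy to inspect or log the request without a server running.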