80.1K Downloads Updated 1 year ago
WizardLM is a 70B parameter model based on Llama 2 trained by WizardLM.
The examples below use the 70B-parameter WizardLM model, which is a general-use model.
ollama serve
curl -X POST http://localhost:11434/api/generate -d '{
"model": "wizardlm:70b-llama2-q4_0",
"prompt":"Why is the sky blue?"
}'
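The same request can also be sent from a script. The following is a minimal Python sketch, not an official client: it assumes the server started by ollama serve is listening on the default address used in the curl command, and that the endpoint streams its reply as one JSON object per line, each carrying a "response" fragment and a "done" flag. The helper names build_request, collect_response, and generate are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is listening on the same local address as the curl example.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def collect_response(lines) -> str:
    """Join the "response" fragments from a stream of JSON lines."""
    parts = []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between JSON objects
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final object marks the end of the stream
    return "".join(parts)


def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the full generated text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return collect_response(resp)
```

With the server running and the model downloaded, calling generate("wizardlm:70b-llama2-q4_0", "Why is the sky blue?") should return the answer as a single string.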
ollama run wizardlm:70b-llama2-q4_0
Note: The ollama run command performs an ollama pull if the model is not already downloaded. To download the model without running it, use ollama pull wizardlm:70b-llama2-q4_0.
If you run into issues with higher quantization levels, try using the q4 model or shutting down any other programs that are using a lot of memory.
By default, Ollama uses 4-bit quantization. To try other quantization levels, use the other tags. The number after the q is the number of bits used for quantization (for example, q4 means 4-bit quantization). The higher the number, the more accurate the model is, but the slower it runs and the more memory it requires.
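To see why more bits mean more memory, the size of the weights can be approximated as parameters × bits ÷ 8 bytes. The Python sketch below applies that rule of thumb to a 70B model. It is a rough lower bound, not a measured figure: it assumes every weight is stored at exactly the stated bit width and ignores quantization metadata, the context cache, and runtime overhead. The function name estimate_weight_gb is hypothetical.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes).

    Assumption: each parameter takes exactly bits_per_weight bits;
    real quantized files are somewhat larger.
    """
    # (params_billions * 1e9 params) * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return params_billions * bits_per_weight / 8


# Compare a few bit widths for a 70B-parameter model.
for bits in (4, 5, 8):
    print(f"q{bits}: ~{estimate_weight_gb(70, bits):.1f} GB for the weights")
```

Under these assumptions, a 4-bit 70B model needs about 35 GB for the weights alone and an 8-bit one about 70 GB, which is why dropping to q4 or freeing up memory helps when a higher quantization level fails to load.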
WizardLM source on Ollama
70b parameters source: The Bloke
70b parameters original source: WizardLM