fsmedberg/hermes-2-pro-llama-3

41 Downloads · Updated 1 year ago

Hermes 2 Pro - Llama-3 70B (f16.q4 and .q5)

2 models

Name                          Size   Context  Input  Updated
hermes-2-pro-llama-3:latest   45GB   8K       Text   1 year ago
hermes-2-pro-llama-3:f16.q5   53GB   8K       Text   1 year ago

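Two quantized versions (Q4_K_M and Q5_K_M) of the Hermes 2 Pro Llama 3 70B model, inspired by instructions published by Robert Sinclair. Both keep the output and token-embedding tensors at f16. The commands used for each tag are listed below; q4_k and q5_k are the llama-quantize aliases for Q4_K_M and Q5_K_M.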

latest
llama-quantize --allow-requantize --output-tensor-type f16 --token-embedding-type f16 model.f16.gguf model.f16.q4.gguf q4_k

f16.q5
llama-quantize --allow-requantize --output-tensor-type f16 --token-embedding-type f16 model.f16.gguf model.f16.q5.gguf q5_k
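
The two commands above differ only in the output file name and the quantization type, so they can be generated from one template. The following is a minimal sketch, not the author's actual build script: the helper name quantize_cmd and the RUN switch are hypothetical, and it assumes llama-quantize is on PATH and model.f16.gguf is in the current directory. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or, with RUN=1, execute) the two llama-quantize commands.
# SRC and the output naming mirror the file names used in the commands above.
SRC="model.f16.gguf"

# Hypothetical helper: build one command line.
# $1 = output suffix (q4 or q5), $2 = quantization type (q4_k or q5_k)
quantize_cmd() {
    echo "llama-quantize --allow-requantize --output-tensor-type f16 --token-embedding-type f16 $SRC model.f16.$1.gguf $2"
}

for pair in "q4 q4_k" "q5 q5_k"; do
    # Split the pair into positional parameters $1 and $2.
    set -- $pair
    cmd=$(quantize_cmd "$1" "$2")
    if [ "${RUN:-0}" = "1" ]; then
        # Assumes llama-quantize is installed; runs the real quantization.
        $cmd
    else
        # Dry run: just show what would be executed.
        echo "$cmd"
    fi
done
```

Running the script as-is prints both command lines for inspection; setting RUN=1 in the environment executes them instead.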