did100/
phi4_Q8:latest

43 pulls · Updated 9 months ago

Microsoft Phi4 pulled from HuggingFace and quantized to Q8


d04922f21d98 · 16GB

phi3 · 14.7B · Q8_0
Microsoft. Copyright (c) Microsoft Corporation. MIT License. Permission is hereby granted, free of ch…
{ "stop": [ "<|im_start|>", "<|im_end|>", "<|im_sep|>" ] }
{{- range $i, $_ := .Messages }} {{- $last := eq (len (slice $.Messages $i)) 1 -}} <|im_start|>{{ .R

Readme

Phi4 with Q8 quantization

This is Microsoft's Phi4 with Q8 quantization; the official ollama version ships as Q4_K_M.

This model was built with the following steps:

  • download the official HuggingFace model
  • convert it to GGUF
  • quantize from BF16 to Q8
  • import into ollama with the same Modelfile as the official ollama Phi4
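The steps above can be sketched roughly like this. The HuggingFace repo id, file names, and script locations are assumptions for illustration, not taken from this page:

```shell
# 1. Download the official weights from HuggingFace (repo id assumed)
huggingface-cli download microsoft/phi-4 --local-dir phi-4

# 2. Convert to GGUF at BF16 with llama.cpp's converter
python convert_hf_to_gguf.py phi-4 --outtype bf16 --outfile phi-4-bf16.gguf

# 3. Quantize from BF16 to Q8_0
./llama-quantize phi-4-bf16.gguf phi-4-Q8_0.gguf Q8_0

# 4. Grab the official ollama Phi4 Modelfile, then import under a new name
ollama show phi4 --modelfile > Modelfile
ollama create phi4_Q8 -f Modelfile
```

Before running `ollama create`, the Modelfile's `FROM` line has to be edited to point at the new `phi-4-Q8_0.gguf` instead of the official blob.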

With a ctx-size of 3200, the official model takes around 10GB of VRAM, while this Q8 version takes 16GB.
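A rough back-of-envelope check on those numbers: Q8_0 stores about 8.5 bits per weight (8-bit values plus a per-block scale), and Q4_K_M averages roughly 4.85 bits per weight. These bits-per-weight figures are approximations, not taken from this page:

```shell
# Weight-only size estimate: params × bits-per-weight / 8 bytes (1 GB = 1e9 bytes).
# KV cache and runtime buffers add on top of these figures.
awk 'BEGIN { printf "Q8_0:   %.1f GB\n", 14.7e9 * 8.5  / 8 / 1e9 }'   # ≈ 15.6 GB
awk 'BEGIN { printf "Q4_K_M: %.1f GB\n", 14.7e9 * 4.85 / 8 / 1e9 }'   # ≈ 8.9 GB
```

The ~15.6GB weight footprint plus context buffers lines up with the 16GB observed for Q8, and ~8.9GB plus buffers with the official model's ~10GB.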