Indeed. Most Phi models, including Phi-3 and the 14B-parameter Phi-4, are distributed in quantized formats such as GGUF (INT4/INT8) and AWQ (Activation-aware Weight Quantization), which substantially reduce memory consumption while retaining acceptable performance.
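As a rough illustration of the savings, the sketch below estimates the weight-only memory footprint of a 14B-parameter model at different precisions. The 14B figure is an assumption for illustration, and real quantized files carry extra overhead for scales and metadata, so actual sizes will be somewhat larger.

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-only memory footprint in gigabytes
    (ignores quantization scales, metadata, and activation memory)."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumed parameter count for illustration (a 14B-parameter model).
N = 14e9
for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: ~{model_size_gb(N, bits):.0f} GB")
# FP16: ~28 GB, INT8: ~14 GB, INT4: ~7 GB
```

In practice this is why an INT4 GGUF of a mid-size model can fit on a single consumer GPU or even run on CPU RAM, while the full-precision weights cannot.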
