Not directly. However, models that have already been quantized with tools such as AutoGPTQ (GPTQ format) or bitsandbytes can be loaded and served in your vLLM hosting environment by pointing vLLM at the pre-quantized checkpoint.
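
A minimal sketch of what this looks like with vLLM's offline Python API, assuming a checkpoint that was already quantized with AutoGPTQ (the model name below is only an example, not a required choice):

```python
from vllm import LLM, SamplingParams

# Load a pre-quantized GPTQ checkpoint; vLLM does not quantize the model itself,
# it only reads weights that were quantized beforehand (e.g. with AutoGPTQ).
llm = LLM(
    model="TheBloke/Llama-2-7B-Chat-GPTQ",  # example GPTQ checkpoint
    quantization="gptq",                    # tell vLLM the weights are GPTQ-quantized
)

# Standard generation against the quantized model.
sampling_params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain what model quantization is."], sampling_params)
print(outputs[0].outputs[0].text)
```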
