Is this one fit for vLLM deployment?

#4
by hulianxue - opened

As the title says.

I want to use this model in my project. Is it suitable for deployment in a vLLM environment?

It seems to use more GPU memory than other models of the same size (70B) and same quantization type (GPTQ), which could lead to GPU OOM errors on a single H100 or A100.

Doesn't seem so. I couldn't run it on a dual-GPU setup with 64 GB of total VRAM.
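For a rough sense of whether a 70B GPTQ model fits, here is a back-of-the-envelope VRAM estimate. The architecture numbers (80 layers, 8 KV heads, head dim 128) are assumptions for a typical Llama-style 70B model, not taken from this repo; check the model's actual config.json. This also ignores activation memory, quantization scales/zeros, and CUDA/vLLM runtime overhead, which is why the real footprint is higher than the weights alone:

```python
# Back-of-the-envelope VRAM estimate for a 70B model in 4-bit GPTQ.
# Architecture numbers below are hypothetical (Llama-style 70B);
# verify against the model's config.json.
N_PARAMS = 70e9
BITS_PER_WEIGHT = 4                                  # GPTQ 4-bit

weights_gb = N_PARAMS * BITS_PER_WEIGHT / 8 / 1e9    # quantized weights only

# KV cache per token with grouped-query attention:
# 2 tensors (K and V) x layers x kv_heads x head_dim x 2 bytes (fp16)
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * 2
kv_gb_4k_seq = kv_bytes_per_token * 4096 / 1e9       # one 4096-token sequence

print(f"weights: ~{weights_gb:.0f} GB")
print(f"KV cache per 4k-token sequence: ~{kv_gb_4k_seq:.2f} GB")
```

With ~35 GB of weights alone, two 32 GB GPUs leave little headroom once the KV cache (which vLLM preallocates up front, controlled by `--gpu-memory-utilization`) and runtime overhead are added, which is consistent with the OOM behavior described above.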
