I ran tests comparing the generation speed of llama2-7b and llama2-7b-gptq with text-generation-inference (v1.0.1). The results show that llama2-7b-gptq is no faster than llama2-7b.
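For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a timing harness. It is not the exact script behind the numbers above. The `generate` callable is a hypothetical stand-in for whatever client call you use against each text-generation-inference endpoint, and it is assumed to return the number of tokens actually generated.

```python
import time
from typing import Callable


def tokens_per_second(
    generate: Callable[[str, int], int],
    prompt: str,
    max_new_tokens: int,
    runs: int = 3,
) -> float:
    """Return average throughput (generated tokens / wall-clock second).

    `generate` is a hypothetical callable (an assumption, not a TGI API):
    it takes a prompt and a token budget, runs one request, and returns
    how many tokens were generated.
    """
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        total_tokens += generate(prompt, max_new_tokens)
        total_seconds += time.perf_counter() - start
    return total_tokens / total_seconds


def fake_generate(prompt: str, max_new_tokens: int) -> int:
    """Dummy backend for trying the harness without a server."""
    time.sleep(0.01)  # pretend the request took 10 ms
    return max_new_tokens


if __name__ == "__main__":
    rate = tokens_per_second(fake_generate, "Hello", max_new_tokens=20)
    print(f"{rate:.1f} tokens/s")
```

Running the same prompt, token budget, and number of runs against both models keeps the comparison fair. A warm-up request before timing may also help, since the first call can include one-time setup cost.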