Proper exllamav2 github branch

#1 by clearcash - opened

Hi @bullerwins,

I wanted to ask if you know which dev branch in the exllamav2 GitHub repo is the correct one to use for fixing the RoPE issue. I noticed there are a dev branch and a dev_tp branch; dev_tp has been updated more recently, but I am not sure whether it is stable. Thanks again for your time!

The needed changes have now been merged, so just use the exllamav2 master branch from the 0.1.8 release.

I quantized a model using the main branch and the quantization went smoothly. However, when attempting to run the quantized model with the command below, it failed: the process started, but I never got any output; it just kept loading something. Do you have any advice on how I can properly quantize and run these models?

!python exllamav2/test_inference.py -m quant/ -p "Prompt"

I also wanted to ask how I can run the models you made using ExLlamaV2. Thanks!
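In case it helps, here is a minimal sketch of loading and running a quantized model through exllamav2's Python API instead of the CLI script, modeled on the inference examples in the repo and assuming the 0.1.8 API; the quant/ path is the quantized model directory from the command above, and the prompt and sampling settings are placeholders:

from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Point the config at the directory produced by the quantization step
config = ExLlamaV2Config("quant/")
model = ExLlamaV2(config)

# A lazy cache lets load_autosplit spread the weights across available GPUs
cache = ExLlamaV2Cache(model, lazy = True)
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# Placeholder sampling settings for a quick smoke test
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.85
settings.top_p = 0.8

output = generator.generate_simple("Prompt", settings, 200)
print(output)

If this also stalls, the hang is likely happening during model loading or VRAM allocation rather than in the quantization itself.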
