with load_in_4bit it just generates <pad> tokens

#16
by NePe - opened

I used the example from the model card with the latest version of transformers, but with load_in_4bit the model only generates <pad> tokens.

You should use `torch_dtype=torch.bfloat16` for it to work.
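
For reference, a minimal sketch of the fixed loading code. The model ID and the seq2seq model class are placeholders (this thread doesn't name the model; `<pad>`-only output is typical of T5-style encoder-decoder models); substitute the snippet from the model card:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "google/flan-t5-xxl"  # placeholder; use the model from the card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    load_in_4bit=True,              # quantize weights to 4-bit via bitsandbytes
    torch_dtype=torch.bfloat16,     # per this thread: bf16 avoids <pad>-only output
    device_map="auto",              # requires accelerate
)

inputs = tokenizer("Translate to German: Hello, world!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Without `torch_dtype`, the non-quantized layers default to a dtype that can overflow in fp16 and produce degenerate (`<pad>`-only) generations; forcing bfloat16 avoids that.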

Thanks, this fixed my issue!

NePe changed discussion status to closed
