RuntimeError: shape mismatch: value tensor of shape [64, 6144] cannot be broadcast to indexing result of shape [256, 6144]
Hi, I get the following error when running the script below:
/root/miniconda3/envs/internvl/lib/python3.9/site-packages/bitsandbytes/nn/modules.py:426: UserWarning: Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed.
warnings.warn(
Traceback (most recent call last):
File "/root/autodl-tmp/InternVL/InternVL-Chat-4bit.py", line 26, in <module>
response = model.chat(tokenizer, pixel_values, question, generation_config)
File "/root/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-4bit/modeling_internvl_chat.py", line 309, in chat
generation_output = self.generate(
File "/root/miniconda3/envs/internvl/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-4bit/modeling_internvl_chat.py", line 353, in generate
input_embeds[selected] = vit_embeds.reshape(-1, C).to(input_embeds.device)
RuntimeError: shape mismatch: value tensor of shape [64, 6144] cannot be broadcast to indexing result of shape [256, 6144]
import torch
from PIL import Image
from transformers import AutoModel, CLIPImageProcessor
from transformers import AutoTokenizer
#path = "OpenGVLab/InternVL-Chat-Chinese-V1-2-Plus"
path = "/root/autodl-tmp/dl/InternVL-Chat-V1-5-4bit"
# Load model directly
model = AutoModel.from_pretrained(path, ignore_mismatched_sizes=True, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
image = Image.open('./examples/image2.jpg').convert('RGB')
image = image.resize((448, 448))
image_processor = CLIPImageProcessor.from_pretrained(path)
pixel_values = image_processor(images=image, return_tensors='pt').pixel_values
pixel_values = pixel_values.to(torch.float16).cuda()
generation_config = dict(
num_beams=1,
max_new_tokens=512,
do_sample=False,
)
question = "请详细描述图片"  # "Please describe the image in detail"
response = model.chat(tokenizer, pixel_values, question, generation_config)
Did you find any fix for this?
I ran into the same bug.
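One possible cause, not confirmed in this thread: the two shapes in the error differ by exactly a factor of four (64 vs. 256 rows), which is what you would see if the vision encoder received a 224x224 image while the prompt reserved image-context tokens for a 448x448 tile. The sketch below only reproduces that arithmetic. The patch size of 14 and the downsample ratio of 0.5 are assumptions about this checkpoint's config (check its `config.json`), and `num_image_tokens` is a hypothetical helper, not a function from the model code.

```python
def num_image_tokens(image_size: int,
                     patch_size: int = 14,            # assumed ViT patch size
                     downsample_ratio: float = 0.5    # assumed token downsampling
                     ) -> int:
    """Visual tokens produced for one square image tile (illustrative only)."""
    patches_per_side = image_size // patch_size
    return int(patches_per_side ** 2 * downsample_ratio ** 2)


# Compare the token count for a 224x224 input against a 448x448 input.
for size in (224, 448):
    print(size, num_image_tokens(size))
# prints:
# 224 64
# 448 256
```

If this is what is happening, printing `pixel_values.shape` right before `model.chat(...)` should show a 224x224 tensor, which would mean the `CLIPImageProcessor` loaded from `path` is resizing or cropping the image again after the manual `image.resize((448, 448))`. In that case, preprocessing so that the tensor really is 448x448 would be the thing to try.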