RUN ON GPU

#1 by itay016

Hi, the model seems great. I'm trying to run it on the GPU, but unfortunately it defaults to the CPU. Is there a guide available? As for the code, I followed the GPU example and, of course, installed CUDA and PyTorch.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("yam-peleg/Hebrew-Gemma-11B")
# device_map="auto" lets accelerate distribute the weights across available GPUs
model = AutoModelForCausalLM.from_pretrained("yam-peleg/Hebrew-Gemma-11B", device_map="auto")

input_text = "ืฉืœื•ื! ืžื” ืฉืœื•ืžืš ื”ื™ื•ื?"  # "Hello! How are you today?"
# move the tokenized inputs to the GPU so they match the model's device
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
out = tokenizer.decode(outputs[0])
print(out)
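
A common cause of this symptom is a CPU-only PyTorch build: device_map="auto" will then quietly place every layer on the CPU. Below is a minimal diagnostic sketch (assuming the same model variable as above) to check whether PyTorch actually sees the GPU and where the weights ended up; the commented-out load is an alternative that forces the whole model onto one GPU.

import torch

# If this prints False, the installed PyTorch build has no CUDA support
# and everything runs on the CPU regardless of device_map="auto".
print(torch.cuda.is_available())

# Shows where the model's weights actually ended up, e.g. cuda:0 or cpu.
print(model.device)

# Alternative: skip device_map and move the model to the GPU explicitly
# (requires enough VRAM to hold the full model on one device).
# model = AutoModelForCausalLM.from_pretrained(
#     "yam-peleg/Hebrew-Gemma-11B", torch_dtype=torch.bfloat16
# ).to("cuda")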
