RUN ON GPU

#1 by itay016

Hi, the model seems great. I'm trying to run it on the GPU, but unfortunately it defaults to the CPU. Is there a guide available? As for the code, I followed the GPU example and, of course, installed CUDA and PyTorch.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("yam-peleg/Hebrew-Gemma-11B")
# device_map="auto" lets accelerate distribute the weights across available GPUs
model = AutoModelForCausalLM.from_pretrained("yam-peleg/Hebrew-Gemma-11B", device_map="auto")

input_text = "ืฉืœื•ื! ืžื” ืฉืœื•ืžืš ื”ื™ื•ื?"  # "Hello! How are you today?"
# move the tokenized inputs to the GPU so they match the model's device
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
out = tokenizer.decode(outputs[0])
print(out)
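
A common cause of this symptom is a CPU-only PyTorch build: device_map="auto" will then quietly place every layer on the CPU. Below is a minimal diagnostic sketch (assuming the same model variable as above) to check whether PyTorch actually sees the GPU and where the weights ended up; the commented-out load is an alternative that forces the whole model onto one GPU.

import torch

# If this prints False, the installed PyTorch build has no CUDA support
# and everything runs on the CPU regardless of device_map="auto".
print(torch.cuda.is_available())

# Shows where the model's weights actually ended up, e.g. cuda:0 or cpu.
print(model.device)

# Alternative: skip device_map and move the model to the GPU explicitly
# (requires enough VRAM to hold the full model on one device).
# model = AutoModelForCausalLM.from_pretrained(
#     "yam-peleg/Hebrew-Gemma-11B", torch_dtype=torch.bfloat16
# ).to("cuda")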
