Having problems generating text
This is my code, which I am running on Kaggle:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.float16)
model.save_pretrained("/kaggle/working/")
I am trying to run this inference code:
inp = tokenizer("Who was first man on moon", return_tensors = "pt")
print(inp)
output = model.generate(inp, top_p=0.95, top_k=60)
# decode output
print("Output >>> " + tokenizer.decode(output[0], skip_special_tokens=True))
This returns the following error:
KeyError                                  Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:267, in BatchEncoding.__getattr__(self, item)
    266 try:
--> 267     return self.data[item]
    268 except KeyError:

KeyError: 'shape'

During handling of the above exception, another exception occurred:
Please help me; I am new to using LLMs. I also have some additional questions: is the model saved to disk space, and how do I access it through Kaggle? What would the RAM requirements be to run a 7-billion-parameter model on Kaggle, and can the free GPU option run it for inference?
Try changing your generate call to this and try again:
output = model.generate(**inp, top_p=0.95, top_k=60)
The tokenizer returns a BatchEncoding with two parts: the input_ids (the token numbers) and the attention_mask. model.generate does not accept the BatchEncoding object itself as its first argument (it tries to read a shape attribute from it, which raises the KeyError: 'shape' you saw), so either unpack it with ** as above or pass inp["input_ids"] directly.
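For reference, here is a minimal end-to-end sketch of the corrected flow. It assumes you have access to the gated meta-llama repo and that the accelerate package is installed (needed for device_map="auto"); the prompt and max_new_tokens value are only illustrative. Note that top_p and top_k only take effect when do_sample=True:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.float16,
    device_map="auto",  # place weights on the available GPU(s); requires accelerate
)

# The tokenizer returns a BatchEncoding holding 'input_ids' and 'attention_mask'.
inp = tokenizer("Who was first man on moon", return_tensors="pt").to(model.device)

# Unpack the BatchEncoding with ** so generate() receives input_ids and
# attention_mask as keyword arguments instead of the dict-like object itself.
output = model.generate(**inp, do_sample=True, top_p=0.95, top_k=60, max_new_tokens=50)

print("Output >>> " + tokenizer.decode(output[0], skip_special_tokens=True))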