Translation Issues
#7, opened by plyons
Can anyone help me identify why I am unable to produce translations for longer inputs? I have a dataset of long texts that I know I will have to chunk. When testing, however, I am not able to produce translations even for long sequences that are well under the model's max length. My code snippet is below.
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bigscience/mt0-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = 'Translate the text following the colon to Spanish: I have a bunch of cats. I would like to go to the beach with them but cats do not like water. Should I take my dogs instead?'
# Tokenize the prompt, truncating anything beyond 512 input tokens
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
# Bound the output with max_new_tokens only; also passing max_length is redundant
translated = model.generate(**inputs, max_new_tokens=512)
translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
print(translated_text)
@unbias The output translation is: No me gustaría tomar gatos. No me gustaría tomar gatos.
Also, what is the most appropriate way to determine the max length for generation? Should max_new_tokens be set to that value?
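For the chunking step mentioned in the question, here is a minimal sketch of one possible approach: greedily pack whole sentences into chunks that stay under a token budget. The names `chunk_by_sentences` and `whitespace_count` are hypothetical helpers for illustration, the regex sentence split is a naive assumption, and the whitespace counter is only a stand-in for a real tokenizer count.

```python
import re
from typing import Callable, List


def chunk_by_sentences(
    text: str, count_tokens: Callable[[str], int], max_tokens: int
) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens tokens.

    A single sentence longer than max_tokens is kept as its own oversized chunk.
    """
    # Naive split after ., ! or ? followed by whitespace (an assumption;
    # a proper sentence splitter would be more robust).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def whitespace_count(s: str) -> int:
    # Stand-in token counter. With the tokenizer from the snippet above, one
    # could instead pass: lambda s: len(tokenizer(s)["input_ids"])
    return len(s.split())


text = (
    "I have a bunch of cats. I would like to go to the beach with them "
    "but cats do not like water. Should I take my dogs instead?"
)
chunks = chunk_by_sentences(text, whitespace_count, max_tokens=20)
for chunk in chunks:
    print(chunk)
```

Each chunk could then be prefixed with the translation instruction and passed to the model separately. The budget of 20 here is purely illustrative; the right value depends on the model and on how much room the prompt prefix takes.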