Weird Output "emphat emphat emphat"
#82 opened by bill13031
Hi there, I tried gemma-7b-it with bfloat16, float16, and float32; they all produce the same weird output.
I've tried both with and without pipeline (a sketch of the no-pipeline variant is below).
My code:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig, pipeline

# Load the model in bfloat16 (float16 and float32 behave the same) and let
# Accelerate place it across the available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it", use_fast=False)
generation_config = GenerationConfig.from_pretrained("google/gemma-7b-it")

text_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    generation_config=generation_config,
    device_map="auto",
)

text = "Write me a poem about Machine Learning"
chat = [{"role": "user", "content": text}]

# The chat template already inserts <bos>, so add_special_tokens=False in the
# pipeline call below avoids a duplicated <bos> token.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
print(prompt)

outputs = text_pipeline(prompt, add_special_tokens=False)
print(outputs)
Output
prompt:
<bos><start_of_turn>user
Write me a poem about Machine Learning<end_of_turn>
<start_of_turn>model
outputs:
[{'generated_text': '<bos><start_of_turn>user\nWrite me a poem about Machine Learning<end_of_turn>\n<start_of_turn>model\n emphat emphat emphat emphat'}]
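
For reference, the no-pipeline variant I tried looks roughly like this (a sketch; max_new_tokens is my own choice, not from the original run) and gives the same repeated "emphat" output:

# Tokenize the templated prompt; special tokens are already in the string.
inputs = tokenizer(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, generation_config=generation_config, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=False))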
I downgraded PyTorch from >=2.2 to <=1.13 (along with the related packages), and the error went away.
However, even on PyTorch >=2.2 the output is fine as long as only a single GPU is used for deployment.
So I infer that the error comes from Accelerate splitting the model across multiple GPUs (via device_map='auto').
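
As a workaround on PyTorch >=2.2, pinning the whole model to one GPU instead of letting Accelerate shard it should reproduce the single-GPU behavior. A minimal sketch, assuming cuda:0 has enough memory for the bfloat16 weights:

import torch
from transformers import AutoModelForCausalLM

# You can first inspect how Accelerate sharded the model when
# device_map="auto" was used:
# print(model.hf_device_map)

# Workaround sketch: place every module on a single GPU instead of sharding.
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it",
    torch_dtype=torch.bfloat16,
    device_map={"": 0},  # pin the full model to cuda:0
)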