How do I achieve streaming output in the code below?
···
import torch
from transformers import pipeline
pipe = pipeline("text-generation", model=r"E:\model\zephyr-7b-beta", torch_dtype=torch.bfloat16, device_map="auto")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# Printed output begins with: <|system|>
···
This does not stream the output. How can I achieve streaming output?
Please note that I'm using a quantized version of Zephyr. Update model_name_or_path and the model loader to match the model you intend to use.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer, pipeline
import torch
# model_name_or_path = "drive/MyDrive/Mistral-7B-OpenOrca_AWQ_GEMM"
model_name_or_path = 'drive/MyDrive/Mistral-7B-Zephyr_AWQ_GEMM'
# Load model. Feel free to change your context length; max_new_tokens=2048 here.
model = AutoAWQForCausalLM.from_quantized(model_name_or_path, fuse_layers=True, safetensors=True, max_new_tokens=2048)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
# Define prompts
system_prompt = "You are a pirate chatbot who always responds with Arr!"
user_prompt = "Tell me about AI"
messages = [
    {
        "role": "system",
        "content": system_prompt,
    },
    {"role": "user", "content": user_prompt},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to('cuda')
generation_output = model.generate(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    pad_token_id=tokenizer.eos_token_id,
    streamer=streamer,  # Here you can pass in a streamer.
)
'''
AI, or artificial intelligence, is a technology that allows machines to learn and perform tasks that typically require human intelligence. It is powered by complex algorithms and vast amounts of data, which the machine uses to make decisions and solve problems. AI has the potential to revolutionize many industries, from healthcare and finance to transportation and manufacturing. Some common examples of AI include virtual assistants like Siri and Alexa, self-driving cars, and chatbots like me, your faithful pirate companion! But beware, for some fear that AI may one day surpass human intelligence and take over the world! Until then, we'll just keep saying "Arr!" and enjoying the high seas.
'''
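If you would rather keep the `pipeline` from the original question, the same `TextStreamer` can most likely be used there too. The sketch below assumes that the text-generation pipeline forwards extra keyword arguments such as `streamer` to `model.generate()`. `stream_reply` is a hypothetical helper name, not a transformers function, and the whole thing should be treated as an untested sketch rather than a definitive recipe.

```python
def stream_reply(pipe, messages, streamer, **generate_kwargs):
    """Format chat messages and generate a reply while streaming tokens.

    `stream_reply` is a hypothetical helper, not part of transformers.
    `pipe` is expected to be a text-generation pipeline and `streamer`
    a transformers streamer such as TextStreamer.
    """
    # Build the prompt string from the chat template, as in the question.
    prompt = pipe.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    # Assumption: the pipeline passes unknown keyword arguments through to
    # model.generate(), so the streamer receives tokens as they are produced.
    return pipe(prompt, streamer=streamer, **generate_kwargs)


# Intended usage with the pipeline from the question (left as comments
# because it needs the model weights):
#
#   from transformers import TextStreamer
#   streamer = TextStreamer(pipe.tokenizer, skip_prompt=True, skip_special_tokens=True)
#   outputs = stream_reply(pipe, messages, streamer, max_new_tokens=256,
#                          do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
```

With `skip_prompt=True` the streamer prints only the newly generated text to stdout, while the returned `outputs` still holds the full result once generation finishes. Streaming like this works for a single prompt at a time, not for batched inputs.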
Thank you so much!