Correct prompt format for chat responses from bloom-7b1

#49
by Saugatkafley - opened

I wanted to know the correct prompt format for bloom-7b1. I have tried several prompts. For example:

import time

from transformers import AutoModelForCausalLM, AutoTokenizer

# Model/tokenizer loading was not shown in the original post; this setup is assumed.
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-7b1").to("cuda")

start = time.time()
# The question between the triple backticks is in Nepali.
query = """Given a question delimited by triple backticks.\nUser: ```के आफूले मन पराएको व्यक्तिले राखेको यौन सम्पर्कको मागलाई स्वीकार्नु ठीक हो ?```\nAssistant:"""

inputs = tokenizer.encode(query, return_tensors="pt").to("cuda")
outputs = model.generate(inputs, max_new_tokens=200, temperature=0.5, do_sample=True)
ans = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(ans[len(query):])  # drop the echoed prompt, keep only the generated answer
print("inference time:", time.time() - start)

The answers hallucinate badly. What can be done?

I'm really surprised this has not been answered.
I'm just getting into this, and I'm finding that with the default settings in text-generation-webui-main, the model gives very short, one-word answers to chat-style prompts.
I'm struggling to find any information on the correct usage of these models, which is frustrating because I'd like to experiment with this system.

So if anyone reading this in the future knows the correct way to interact with these models, please let us know!

BigScience Workshop org

This is to be expected, as BLOOM was not instruction-tuned; it is a plain pretrained language model, so it continues text rather than following User:/Assistant: style instructions.
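For anyone landing here later, a common workaround is to use BLOOMZ, the multitask/instruction-finetuned variant of BLOOM from the same workshop, and to phrase the prompt as a direct instruction rather than a chat exchange. The sketch below is illustrative only: the bigscience/bloomz-7b1 checkpoint choice, the example prompt, and the generation settings are assumptions, not anything prescribed in this thread.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# bloomz-7b1 is the instruction-finetuned variant of bloom-7b1;
# the checkpoint and settings here are illustrative assumptions.
checkpoint = "bigscience/bloomz-7b1"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16).to("cuda")

# BLOOMZ responds to a plain instruction, no User:/Assistant: template needed.
prompt = "Translate to English: Je t'aime."
inputs = tokenizer.encode(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(inputs, max_new_tokens=50)  # greedy decoding; sampling is optional
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Alternatively, base BLOOM can be steered somewhat with few-shot prompting (a couple of worked question/answer pairs placed before the real question), but for reliable chat-style responses an instruction-tuned model is the better fit.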

christopher changed discussion status to closed
