How to test the model
Hi there,
First of all, thanks for sharing this cool model.
I tried to test it but couldn't get a result, so I think the way I tried to test it might be wrong. Could you please guide me on how to test it simply, or check my code?
Here is the code that I used, but it just rewrites the query and nothing more!
from transformers import AutoTokenizer
import transformers
import torch

model = "beomi/llama-2-ko-7b"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)

query = "whatever"
sequences = pipeline(
    query,
    do_sample=False,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
Thanks for your attention!
It seems to be working as intended. I tested it on Google Colab using your code and it seems to work fine.
Here's demo colab link: https://colab.research.google.com/drive/1yw2wnge6iHfj7PO5VVDA3jkmliiOqQvd?usp=sharing
What I changed is one line of code: since this checkpoint is stored in BF16, you'll need to use torch_dtype=torch.bfloat16, or remove that line entirely (the model's config already specifies torch_dtype). But actually it is not a critical issue for running the model.
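In other words, only that one argument changes; a minimal sketch of the edit, keeping the rest of your call as-is:

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # match the BF16 checkpoint (or drop this line and let the model config decide)
    device_map="auto",
)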
Could you share more details about your environment (Python version, PyTorch version, GPU, NVIDIA driver version, CUDA version, transformers/tokenizers/accelerate versions)?
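If it's easier, something like this should print most of them (just a small helper sketch; it assumes the packages are importable in your environment):

import sys
import torch, transformers, tokenizers, accelerate

print("python:", sys.version.split()[0])
print("pytorch:", torch.__version__, "| cuda:", torch.version.cuda)
print("gpu:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
print("transformers:", transformers.__version__)
print("tokenizers:", tokenizers.__version__)
print("accelerate:", accelerate.__version__)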
Thank you for the prompt response, and thanks for your guidance.
It seems that the issue has been solved.
I also obtained the same result that you shared:
Loading checkpoint shards: 100%|██████████| 15/15 [00:26<00:00, 1.77s/it]
Result: How's the weather today? (오늘 날씨가 어떻습니까?) 10. 오늘 저녁에 뭐 할 겁니까? What are you doing tonight? 11. 몇 시에 퇴근합니까? What time do you get off? 12. 오늘은 몇 시에 출근합니까? What time are you coming to work today? 13. 어디를 가십니까? Where are you headed? 14. 당신은 무슨 일로 전화하셨습니까? May I help you, sir? 15. 이 옷은 어때요? How does this look on me? 16. 차 한 잔 어떻습니까? How about a cup of coffee? 17. 왜 저에게 그렇게 화를 내고 있습니까? Why are you so angry with me? 18. 나는 당신을 사랑합니다. I love you. 19. 나는 당신을 좋아합니다. I like you. 20. 당신은 참 … You...
It's working, but it seems to be generating questions and translations rather than general text.
Did you train the model to generate similar questions with their translations?
I've also noticed that it doesn't work well with longer texts; it just rewrites the whole given context again.
and here are my current environment details:
python: 3.8.16
pytorch: 2.0.1+cu117
GPU: A100 80G
Nvidia-Driver Version: 495.29.05
CUDA Version: 11.5
transformers: 4.32.0.dev0
tokenizers: 0.13.3
accelerate: 0.20.3
thanks again!
It might be a sampling issue.
How about adding some temperature and top-p sampling?
The phenomenon shown is NOT intended, since I trained the model on shuffled texts.
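For example, something along these lines (the exact values are just a starting point, not tuned):

sequences = pipeline(
    query,
    do_sample=True,      # enable sampling instead of greedy decoding
    temperature=0.7,
    top_p=0.9,           # nucleus sampling
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)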
I saw the same issue as reported by Soroor. I already used temperature 0.7, top-p 0.9.
Is there a good prompt template to use for chat?