
How can I use this model without using Pipeline?

#6
by panzeyu2013 - opened

I want to use the model as one component of a larger system, so I need full access to the model itself. I have also noticed that the model.generate method produces absurd sentences. Could this be caused by different tokenizer settings?
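For reference, here is a minimal sketch of loading the model directly with `AutoModelForCausalLM`/`AutoTokenizer` instead of a Pipeline. The model ID below is a placeholder, and the prompt template assumes a Llama-2-style chat checkpoint; if the checkpoint expects a chat template and `generate` is called on a raw prompt, that alone can produce incoherent output.

```python
def build_llama2_chat_prompt(user_message: str,
                             system_message: str = "You are a helpful assistant.") -> str:
    # Llama-2 chat checkpoints expect this [INST]/<<SYS>> template (an
    # assumption about this model); feeding generate() a bare prompt
    # without it is a common cause of "absurd" generations.
    return f"<s>[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n{user_message} [/INST]"


def generate_reply(model_id: str, prompt: str, max_new_tokens: int = 128) -> str:
    # Import here so the template helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # "meta-llama/Llama-2-7b-chat-hf" is a placeholder ID, not necessarily this model.
    prompt = build_llama2_chat_prompt("What is the capital of France?")
    print(generate_reply("meta-llama/Llama-2-7b-chat-hf", prompt))
```

With direct access like this you can inspect or modify the model object freely (e.g. swap layers, hook activations), which a Pipeline hides.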

I have a chat-model UI that uses vLLM instead of the Hugging Face Pipeline. You can run it with an ngrok auth token here: https://colab.research.google.com/drive/1OaWYiHBt-nkSNCik6H0lhAWcpLCYvauq#scrollTo=oHb4LKvLy5aD
