How can I run this app in my own infra?
Any pointers on how I can run this "space" in my own infra? Assuming I have sufficient hardware and the falcon-40b-instruct model downloaded, is there some code I can use without relying on the HF text generation API?
You can use oobabooga's text generation UI. It gives you many options, including 4-bit quantization, text streaming, easy training, and a lot more. It's a one-click setup and is compatible with all PyTorch text generation models. With Falcon you will need to check the "trust remote code" box, because the Falcon repo includes Python scripts that have to run on your machine. My 3060 runs falcon-7B at about one word per second in 4-bit.
Great, thanks. I am familiar with the text generation UI; it just hadn't occurred to me that I could hook it up to Falcon. I will have to make an API out of it rather than use the UI, though, since that is what I need.
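For the "API instead of UI" part, here is a minimal sketch of a JSON endpoint built only on the Python standard library. It is not what the Space or the web UI does internally. The names make_handler, make_server and echo_generate are made up for illustration, and echo_generate is just a placeholder that you would swap for a function that calls your locally loaded model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(generate_fn):
    """Build a request handler that answers POSTs using generate_fn(prompt) -> str."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length) or b"{}")
                prompt = payload["prompt"]
            except (ValueError, KeyError, TypeError):
                self.send_error(400, "expected a JSON body with a 'prompt' field")
                return
            body = json.dumps({"text": generate_fn(prompt)}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def echo_generate(prompt):
    """Placeholder (hypothetical): replace with a call into your local model."""
    return "echo: " + prompt


def make_server(generate_fn, host="127.0.0.1", port=8000):
    """Create the HTTP server without starting it."""
    return HTTPServer((host, port), make_handler(generate_fn))
```

To try it, call make_server(echo_generate).serve_forever() and POST a body like {"prompt": "Hello"} to http://127.0.0.1:8000/. This handles one request at a time, which is probably acceptable for a single GPU but not for real traffic.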
Where do I check the "trust remote code" box for Falcon?
I am getting this error: Loading models\tiiuae_falcon-7b requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True
to remove this error. Where do I add this option? Please help ASAP.
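If you load the model yourself with the transformers library instead of going through the UI, trust_remote_code=True is a keyword argument to from_pretrained. The following is a minimal sketch under several assumptions: transformers, torch and accelerate are installed (accelerate is needed for device_map="auto"), the default model_path is the Hub id and can be replaced by your local folder, and build_load_kwargs and generate are illustrative names, not part of any library. In the web UI, as far as I know, the equivalent is starting the server with the --trust-remote-code flag, but check its documentation for your version.

```python
def build_load_kwargs(trust_remote_code=True, device_map="auto"):
    """Keyword arguments forwarded to AutoModelForCausalLM.from_pretrained."""
    return {"trust_remote_code": trust_remote_code, "device_map": device_map}


def generate(prompt, model_path="tiiuae/falcon-7b", max_new_tokens=50):
    """Load the model from model_path and return a completion for prompt."""
    # Imported lazily so the helper above works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True lets transformers run the Python files shipped
    # in the model repo; read them first, as the error message advises.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.bfloat16,  # assumption: hardware with bfloat16 support
        **build_load_kwargs(),
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Pass only the tensors generate() needs; some tokenizers return extras.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("Write a haiku about GPUs.") reloads the model every time, so for repeated use you would load the tokenizer and model once and keep them around.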