runtime error
mple_instruct'
python generate.py --base_model='philschmid/bart-large-cnn-samsum'
python generate.py --base_model='philschmid/flan-t5-base-samsum'
python generate.py --base_model='facebook/mbart-large-50-many-to-many-mmt'
python generate.py --base_model='togethercomputer/GPT-NeoXT-Chat-Base-20B' --prompt_type='human_bot' --lora_weights='GPT-NeoXT-Chat-Base-20B.merged.json.8_epochs.57b2892c53df5b8cefac45f84d019cace803ef26.28'

Must have 4*48GB GPUs and run without 8-bit in order for sharding to work with infer_devices=False.
Can also pass --prompt_type='human_bot'; the model can somewhat handle instructions without being instruct-tuned.
python generate.py --base_model=decapoda-research/llama-65b-hf --load_8bit=False --infer_devices=False --prompt_type='human_bot'
python generate.py --base_model=h2oai/h2ogpt-oig-oasst1-256-6.9b

No model defined yet
Get OpenAssistant/reward-model-deberta-v3-large-v2 model
Traceback (most recent call last):
  File "/home/user/app/app.py", line 1959, in <module>
    fire.Fire(main)
  File "/home/user/.local/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/user/.local/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/user/.local/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/user/app/app.py", line 290, in main
    go_gradio(**locals())
  File "/home/user/app/app.py", line 541, in go_gradio
    smodel, stokenizer, sdevice = get_score_model(**all_kwargs)
  File "/home/user/app/app.py", line 523, in get_score_model
    smodel, stokenizer, sdevice = get_model(**score_all_kwargs)
  File "/home/user/app/app.py", line 403, in get_model
    device = get_device()
  File "/home/user/app/app.py", line 297, in get_device
    raise RuntimeError("only cuda supported")
RuntimeError: only cuda supported
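The traceback shows that app.py's get_device() raises as soon as no CUDA device is found, which is what happens when this Space runs on CPU-only hardware. A minimal sketch of the alternative, assuming the app tolerates CPU inference (the function below is a hypothetical stand-in, not the actual app.py code; availability is passed in rather than queried via torch so the idea is self-contained):

```python
def get_device(cuda_available: bool) -> str:
    """Pick a device string, falling back to CPU instead of raising.

    Hypothetical variant of the get_device() in the traceback, which
    instead raises RuntimeError("only cuda supported") when no GPU exists.
    """
    # Prefer the GPU when present; otherwise degrade gracefully to CPU.
    return "cuda" if cuda_available else "cpu"
```

In the real application the flag would come from something like torch.cuda.is_available(); the point is simply that a CPU fallback avoids crashing the whole Gradio app when the score model is loaded on GPU-less hardware.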