runtime error
| 3.54G/4.96G [00:33<00:10, 131MB/s]
pytorch_model-00002-of-00002.bin: 79%|████████ | 3.90G/4.96G [00:34<00:06, 168MB/s]
pytorch_model-00002-of-00002.bin: 84%|█████████ | 4.14G/4.96G [00:35<00:04, 178MB/s]
pytorch_model-00002-of-00002.bin: 88%|█████████ | 4.38G/4.96G [00:38<00:03, 145MB/s]
pytorch_model-00002-of-00002.bin: 100%|██████████| 4.96G/4.96G [00:39<00:00, 126MB/s]
Downloading shards: 100%|██████████| 2/2 [01:42<00:00, 49.40s/it]
Downloading shards: 100%|██████████| 2/2 [01:42<00:00, 51.48s/it]
config.json: 0%| | 0.00/4.61k [00:00<?, ?B/s]
config.json: 100%|██████████| 4.61k/4.61k [00:00<00:00, 11.2kB/s]
config.json: 0%| | 0.00/4.61k [00:00<?, ?B/s]
config.json: 100%|██████████| 4.61k/4.61k [00:00<00:00, 65.6kB/s]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 143, in <module>
    handler = Chat(model_path, conv_mode=conv_mode, load_8bit=load_8bit, load_4bit=load_8bit, device=device)
  File "/home/user/app/llava/serve/gradio_utils.py", line 56, in __init__
    self.tokenizer, self.model, processor, context_len = load_pretrained_model(model_path, model_base, model_name,
  File "/home/user/app/llava/model/builder.py", line 114, in load_pretrained_model
    model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3002, in _load_pretrained_model
    raise ValueError(
ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.
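The `ValueError` above asks for an `offload_folder` whenever the `device_map` places some weights on disk. A minimal sketch of one way to prepare that argument follows. The helper `build_offload_kwargs` and the directory name `offload` are hypothetical, not part of the app or of LLaVA; `device_map` and `offload_folder` are keyword arguments accepted by `from_pretrained` in `transformers`. Whether `load_pretrained_model` in `llava/model/builder.py` forwards extra keyword arguments is an assumption that would need checking against that file.

```python
import os
import tempfile


def build_offload_kwargs(offload_dir, device_map="auto"):
    """Hypothetical helper: create the offload directory and return
    keyword arguments that can be passed to `from_pretrained`."""
    # The folder must exist and be writable so offloaded weights can be stored there.
    os.makedirs(offload_dir, exist_ok=True)
    return {
        "device_map": device_map,       # let accelerate decide weight placement
        "offload_folder": offload_dir,  # where weights mapped to disk are written
    }


if __name__ == "__main__":
    # Assumed location; any writable path would do.
    kwargs = build_offload_kwargs(os.path.join(tempfile.gettempdir(), "offload"))
    print(kwargs)
    # Intended use (not executed here, requires transformers and the model files):
    # model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
```

The other route named in the error message is to install `safetensors`, which only helps if the model repository publishes its weights in that format; the log shows this one downloading `pytorch_model-*.bin` shards.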