Model inference giving 503 error
#25
opened by DeepTreeTeam
Hello,
For more than a day now, serverless inference for this model has kept returning a 503 "Service Unavailable" error. Is there a specific reason the model is unavailable? When will it be back?
Thank you!
The same...
Same here. Any updates on the issue so far?
This model was taken down from the Inference API, but the FP8 version is available through NIM on DGX Cloud: https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct-FP8?dgx_inference=true
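For anyone who wants to confirm the status themselves, here is a minimal sketch that probes the serverless Inference API directly and reports the HTTP status. The model ID and the HF_TOKEN environment variable are placeholders; adjust them for your own setup.

```python
# Minimal sketch: probe the serverless Inference API for a model and report its status.
# Assumptions: MODEL_ID and the HF_TOKEN environment variable are placeholders.
import os
import requests

MODEL_ID = "meta-llama/Meta-Llama-3.1-405B-Instruct"  # placeholder model ID
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"
HEADERS = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

response = requests.post(API_URL, headers=HEADERS, json={"inputs": "Hello"})

if response.status_code == 503:
    # 503 can mean the model is still loading (retry later) or that it is no
    # longer deployed on the serverless Inference API at all.
    print("503 Service Unavailable:", response.text)
elif response.status_code == 200:
    print(response.json())
else:
    print(f"Unexpected status {response.status_code}: {response.text}")
```

If the 503 persists across retries, the model has most likely been removed from serverless deployment rather than just loading.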