Model inference giving 503 error

#25
by DeepTreeTeam - opened

Hello,
For more than a day now, the serverless inference for this model has kept returning a 503 ("Service Unavailable") error. Is there a specific reason the model is unavailable? When will it be back?
Thank you!

The same...

Same here. Any updates on the issue so far?
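For anyone hitting this in the meantime: a 503 from the serverless Inference API usually means the model is loading or temporarily unavailable, so a simple retry with backoff is worth trying before assuming the model is gone. Below is a minimal, hedged sketch of such a retry helper. It is generic (not an official Hugging Face API): `send` is any zero-argument callable returning a response-like object with a `status_code` attribute, and the names `call_with_retry`, `max_retries`, and `base_delay` are my own.

```python
import time


def call_with_retry(send, max_retries=4, base_delay=1.0):
    """Retry a request while it returns HTTP 503 (Service Unavailable).

    `send` is a zero-argument callable returning an object with a
    `status_code` attribute, e.g. `lambda: requests.post(url, ...)`.
    Retries up to `max_retries` times with exponential backoff, then
    returns the last response regardless of its status code.
    """
    response = send()
    for attempt in range(max_retries):
        if response.status_code != 503:
            return response
        # Back off before retrying: 503 from the serverless API often
        # just means the model is still loading on the backend.
        time.sleep(base_delay * 2 ** attempt)
        response = send()
    return response
```

With `requests` installed you would use it as `call_with_retry(lambda: requests.post(api_url, headers=headers, json=payload))` and inspect the returned response. If the 503 persists across all retries, as in this thread, the model has likely been removed from the serverless API rather than merely loading.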

This model was taken down from the inference API, but the fp8 version is available through NIM on DGX Cloud: https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct-FP8?dgx_inference=true

Cost for this is based on compute time.
