Error when deploying with Inference Endpoints

#1
by edgelesssys - opened

When trying to run this model with HF Inference Endpoints using TGI, the following error is shown:

[Server message]Endpoint failed to start
Exit code: 3. Reason: vice GPU
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1128, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 825, in __getitem__
    raise KeyError(key)
KeyError: 'gemma2'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 732, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 608, in __aenter__
    await self._router.startup()
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 709, in startup
    await handler()
  File "/app/webservice_starlette.py", line 60, in some_startup_task
    inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
  File "/app/huggingface_inference_toolkit/handler.py", line 54, in get_inference_handler_either_custom_or_default_handler
    return HuggingFaceHandler(model_dir=model_dir, task=task)
  File "/app/huggingface_inference_toolkit/handler.py", line 18, in __init__
    self.pipeline = get_pipeline(
  File "/app/huggingface_inference_toolkit/utils.py", line 276, in get_pipeline
    hf_pipeline = pipeline(
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/__init__.py", line 815, in pipeline
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1130, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `gemma2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

Application startup failed. Exiting.
VAGO solutions org

Hey @edgelesssys ,
you need to update your transformers library to a release that supports the `gemma2` architecture (transformers 4.42.0 or later).
Best regards
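As a quick sanity check before redeploying, a minimal sketch like the one below compares the installed transformers version against 4.42.0, the release that (to my understanding) first registered the `gemma2` model type; the version cutoff and the `parse` helper are assumptions for illustration:

```python
# Minimal sketch: check whether the installed transformers release predates
# gemma2 support (assumed to have landed in 4.42.0).
from importlib.metadata import version, PackageNotFoundError

MIN = (4, 42, 0)  # assumed first release with the `gemma2` model type

def parse(v: str) -> tuple:
    # Keep only the leading numeric dotted prefix,
    # e.g. "4.42.0.dev0" -> (4, 42, 0)
    parts = []
    for piece in v.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

try:
    installed = version("transformers")
    if parse(installed) < MIN:
        print(f"transformers {installed} predates gemma2; upgrade with "
              "pip install -U 'transformers>=4.42.0'")
    else:
        print(f"transformers {installed} should recognize gemma2")
except PackageNotFoundError:
    print("transformers is not installed")
```

On Inference Endpoints the fix amounts to using a container image whose transformers version meets that cutoff, rather than upgrading in place.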

DavidGF changed discussion status to closed
