SageMaker deployment error
How do I import the tokenizer so that SageMaker can deploy the model?
ValueError: Tokenizer class SEABPETokenizer does not exist or is not currently imported.
Hi,
Thank you for your interest in SEA-LION.
The SEA-LION tokenizer and model require code execution and therefore need trust_remote_code to be set to True.
I do not have much experience with SageMaker, but there is a passage in the documentation which might be relevant:
https://sagemaker.readthedocs.io/en/stable/overview.html#deploy-foundation-models-to-sagemaker-endpoints
For gated models on Hugging Face Hub, request access and pass the associated key as the environment variable HUGGING_FACE_HUB_TOKEN. Some Hugging Face models may require trusting of remote code, so set HF_TRUST_REMOTE_CODE as an environment variable.
Could you kindly set the HF_TRUST_REMOTE_CODE environment variable to True and see if this fixes your issue?
Thank you
Raymond
Hi Raymond,
Thank you for your quick support on my query!
I have tried your suggestion and followed the documentation you provided, but the SageMaker endpoint for SEA-LION still fails to start. Perhaps these logs might be useful? It looks like trust_remote_code is still false...
This is my model hub configuration:
hub = {
    'HF_MODEL_ID': 'aisingapore/sealion7b-instruct-nc',
    'SM_NUM_GPUS': json.dumps(1),
    'HF_TRUST_REMOTE_CODE': 'True',
    'trust_remote_code': 'True',
    'HUGGING_FACE_HUB_TOKEN': 'read token'
}
These are the logs from SageMaker:
2024-02-22T03:06:04.686906Z  INFO text_generation_launcher: Args { model_id: "aisingapore/sealion7b-instruct-nc", revision: None, validation_workers: 2, sharded: None, num_shard: Some(1), quantize: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, hostname: "container-0.local", port: 8080, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/tmp"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false }
2024-02-22T03:07:10.282868Z  INFO text_generation_launcher: Downloaded /tmp/models--aisingapore--sealion7b-instruct-nc/snapshots/eaf0b7163f8a4ce80cb2a2c8e6118f3e571f77e1/model-00002-of-00002.safetensors in 0:00:33.
2024-02-22T03:07:16.527671Z ERROR text_generation_launcher: Error when initializing model
ValueError: Tokenizer class SEABPETokenizer does not exist or is not currently imported.
Error: ShardCannotStart
Hope we'll be able to get this working...
Thanks!
Hi,
My apologies for the late follow-up.
Please find attached a sample notebook on how to deploy SEA-LION on SageMaker:
https://drive.google.com/file/d/1FLTiUGbK519N0EHQArMd0wILzgFzj6Mp/view?usp=sharing
The model hub config should be:
hub = {
    'HF_MODEL_ID': 'aisingapore/sealion7b-instruct-nc',
    'SM_NUM_GPUS': json.dumps(4),
    'HF_MODEL_TRUST_REMOTE_CODE': json.dumps(True),
}
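For reference, here is a minimal sketch of how this config is typically wired into a deployment with the SageMaker Hugging Face LLM container. The role ARN, instance type, and startup timeout below are placeholder assumptions to adjust for your account, not values from the notebook:

```python
import json

# Environment for the SageMaker Hugging Face LLM (TGI) container.
# HF_MODEL_TRUST_REMOTE_CODE (rather than HF_TRUST_REMOTE_CODE) is the
# variable the container translates into TGI's trust_remote_code setting.
hub = {
    'HF_MODEL_ID': 'aisingapore/sealion7b-instruct-nc',
    'SM_NUM_GPUS': json.dumps(4),
    'HF_MODEL_TRUST_REMOTE_CODE': json.dumps(True),  # serializes to the string "true"
}

def deploy_sea_lion(role_arn: str):
    # Requires the sagemaker SDK and AWS credentials; the instance type is
    # an assumption chosen to provide the 4 GPUs that SM_NUM_GPUS requests.
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    model = HuggingFaceModel(
        image_uri=get_huggingface_llm_image_uri("huggingface"),
        env=hub,
        role=role_arn,
    )
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.12xlarge",
        container_startup_health_check_timeout=600,
    )
```

Note that json.dumps(True) produces the lowercase string "true", which is why it is used instead of the Python-style string "True" from the earlier attempt.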
Hope this helps,
Raymond