SageMaker deployment error
How do I import the tokenizer so that SageMaker can deploy the model?
ValueError: Tokenizer class SEABPETokenizer does not exist or is not currently imported.
Hi,
Thank you for your interest in SEA-LION.
The SEA-LION tokenizer and model require code execution and therefore need trust_remote_code to be set to True.
I do not have much experience with SageMaker, but there is a passage in the documentation which might be relevant:
https://sagemaker.readthedocs.io/en/stable/overview.html#deploy-foundation-models-to-sagemaker-endpoints
For gated models on Hugging Face Hub, request access and pass the associated key as the environment variable HUGGING_FACE_HUB_TOKEN. Some Hugging Face models may require trusting of remote code, so set HF_TRUST_REMOTE_CODE as an environment variable.
Could you kindly set the HF_TRUST_REMOTE_CODE environment variable to True and see if this fixes your issue?
Thank you
Raymond
Hi Raymond,
Thank you for your quick support on my query!
I have tried your suggestion and followed the documentation you provided, but the SageMaker endpoint for SEA-LION still fails to start. Perhaps these logs might be useful? It looks like trust_remote_code is still false...
This is my model hub configuration:
hub = {
    'HF_MODEL_ID': 'aisingapore/sealion7b-instruct-nc',
    'SM_NUM_GPUS': json.dumps(1),
    'HF_TRUST_REMOTE_CODE': 'True',
    'trust_remote_code': 'True',
    'HUGGING_FACE_HUB_TOKEN': 'read token'
}
These are the logs from SageMaker:
2024-02-22T03:06:04.686906Z  INFO text_generation_launcher: Args { model_id: "aisingapore/sealion7b-instruct-nc", revision: None, validation_workers: 2, sharded: None, num_shard: Some(1), quantize: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, hostname: "container-0.local", port: 8080, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/tmp"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false }
2024-02-22T03:07:10.282868Z  INFO text_generation_launcher: Downloaded /tmp/models--aisingapore--sealion7b-instruct-nc/snapshots/eaf0b7163f8a4ce80cb2a2c8e6118f3e571f77e1/model-00002-of-00002.safetensors in 0:00:33.
2024-02-22T03:07:16.527671Z ERROR text_generation_launcher: Error when initializing model
ValueError: Tokenizer class SEABPETokenizer does not exist or is not currently imported.
Error: ShardCannotStart
Hope we'll be able to get this working...
Thanks!
Hi,
My apologies for the late follow-up.
Please find attached a sample notebook on how to deploy SEA-LION on SageMaker:
https://drive.google.com/file/d/1FLTiUGbK519N0EHQArMd0wILzgFzj6Mp/view?usp=sharing
The model hub config should be:
hub = {
    'HF_MODEL_ID': 'aisingapore/sealion7b-instruct-nc',
    'SM_NUM_GPUS': json.dumps(4),
    'HF_MODEL_TRUST_REMOTE_CODE': json.dumps(True),
}
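For reference, here is a minimal sketch of how this config is typically wired into a deployment with the SageMaker Hugging Face LLM container. The role ARN, instance type, and startup timeout below are placeholder assumptions to adjust for your account, not values from the notebook:

```python
import json

# Environment for the SageMaker Hugging Face LLM (TGI) container.
# HF_MODEL_TRUST_REMOTE_CODE (rather than HF_TRUST_REMOTE_CODE) is the
# variable the container translates into TGI's trust_remote_code setting.
hub = {
    'HF_MODEL_ID': 'aisingapore/sealion7b-instruct-nc',
    'SM_NUM_GPUS': json.dumps(4),
    'HF_MODEL_TRUST_REMOTE_CODE': json.dumps(True),  # serializes to the string "true"
}

def deploy_sea_lion(role_arn: str):
    # Requires the sagemaker SDK and AWS credentials; the instance type is
    # an assumption chosen to provide the 4 GPUs that SM_NUM_GPUS requests.
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    model = HuggingFaceModel(
        image_uri=get_huggingface_llm_image_uri("huggingface"),
        env=hub,
        role=role_arn,
    )
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.12xlarge",
        container_startup_health_check_timeout=600,
    )
```

Note that json.dumps(True) produces the lowercase string "true", which is why it is used instead of the Python-style string "True" from the earlier attempt.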
Hope this helps,
Raymond