Runtime error from SageMaker deploy
Hi, can someone help me debug this?
I am trying to deploy WizardCoder on SageMaker using the recommended deploy code, but I am getting the following error:
RuntimeError: found uninitialized parameters in model : ['transformer.h.0.attn.c_attn.weight', 'transformer.h.0.attn.c_proj.weight', 'transformer.h.0.mlp.c_fc.weight', 'transformer.h.0.mlp.c_proj.weight', 'transformer.h.1.attn.c_attn.weight', 'transformer.h.1.attn.c_proj.weight', 'transformer.h.1.mlp.c_fc.weight', 'transformer.h.1.mlp.c_proj.weight', 'transformer.h.2.attn.c_attn.weight', 'transformer.h.2.attn.c_proj.weight', 'transformer.h.2.mlp.c_fc.weight', 'transformer.h.2.mlp.c_proj.weight', 'transformer.h.3.attn.c_attn.weight', 'transformer.h.3.attn.c_proj.weight', 'transformer.h.3.mlp.c_fc.weight', 'transformer.h.3.mlp.c_proj.weight'........
Recommended Deploy code -
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri
try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']
# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'TheBloke/WizardCoder-15B-1.0-GPTQ',
    'SM_NUM_GPUS': json.dumps(1)
}
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
    role=role,
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)
# send request
predictor.predict({
    "inputs": "My name is Julien and I like to",
})
I have no experience with SageMaker at all, I'm afraid, but I'm not sure this model is going to work. Firstly, it's a GPTQ model, and I'm not sure whether the SageMaker code supports those.
Text Generation Inference (TGI), the Hugging Face inference library, did recently add support for GPTQ, and maybe SageMaker is using that? But I don't believe TGI supports StarCoder models (in GPTQ, anyway), and this is a StarCoder model.
This GitHub issue describes how to get my GPTQs working with Text Generation Inference; currently, environment variables are required: https://github.com/huggingface/text-generation-inference/issues/601
But I've already had it reported to me that TGI doesn't work with StarCoder, so even if SageMaker is using TGI, I wouldn't expect it to work with this specific model. Try one of my Llama GPTQs instead.
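If you do try a Llama GPTQ, here is a rough sketch of how the environment variables mentioned above might be added to the `hub` dict from the deploy code. The variable names `HF_MODEL_QUANTIZE`, `GPTQ_BITS` and `GPTQ_GROUPSIZE`, and the default values 4 and 128, are my assumptions about what TGI expects, so please check them against the linked issue. The helper `build_gptq_env` and the model ID placeholder are hypothetical, and the container version in the original code (0.8.2) may be too old to include GPTQ support at all.

```python
import json


def build_gptq_env(model_id, num_gpus=1, bits=4, groupsize=128):
    """Build the `env` dict passed to HuggingFaceModel for a GPTQ model.

    Hypothetical helper. The GPTQ-related variable names below are
    assumptions and should be verified against the TGI issue linked above.
    """
    return {
        "HF_MODEL_ID": model_id,
        "SM_NUM_GPUS": json.dumps(num_gpus),
        # Assumed: tells the TGI container to load GPTQ-quantized weights.
        "HF_MODEL_QUANTIZE": "gptq",
        # Assumed: quantization parameters; environment values must be strings.
        "GPTQ_BITS": str(bits),
        "GPTQ_GROUPSIZE": str(groupsize),
    }


# Placeholder model ID: substitute the Llama GPTQ repo you actually want.
hub = build_gptq_env("TheBloke/<a-llama-model>-GPTQ")
print(json.dumps(hub, indent=2))
```

The resulting `hub` would be passed as `env=hub` to `HuggingFaceModel`, exactly as in the original deploy code. The bits and group size need to match how the particular model was quantized.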