Alibaba-NLP/qwen2-impl · Errors when using xformer implementation

Aug 28

Hi, thanks for your update. When using the new implementation with transformers, I get the following error:
    model = cls(config, *model_args, **model_kwargs)
TypeError: __init__() got an unexpected keyword argument 'unpad_inputs'

izhx

Alibaba-NLP org Aug 29

Have you done these two steps?

Download the configuration.py and modeling.py to your saved gte-Qwen2 model directory.

Replace "modeling_qwen." with "modeling." in the auto_map field of config.json.

https://huggingface.co/Alibaba-NLP/qwen2-impl#usage
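
The second step can also be done with a small script instead of editing by hand. This is only a minimal sketch: the helper name `patch_auto_map` and its parameters are made up for illustration, and it assumes config.json sits directly in the saved model directory and that the auto_map values are plain strings starting with "modeling_qwen.".

```python
import json
from pathlib import Path


def patch_auto_map(model_dir, old_prefix="modeling_qwen.", new_prefix="modeling."):
    """Rewrite auto_map entries in <model_dir>/config.json and return the new map.

    Hypothetical helper: it only touches string values that start with
    old_prefix and leaves every other field of the config unchanged.
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))

    auto_map = config.get("auto_map") or {}
    patched = {}
    for key, target in auto_map.items():
        if isinstance(target, str) and target.startswith(old_prefix):
            # e.g. "modeling_qwen.Qwen2Model" becomes "modeling.Qwen2Model"
            target = new_prefix + target[len(old_prefix):]
        patched[key] = target

    if patched:
        config["auto_map"] = patched
        config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return patched
```

Because the edited config.json and the two downloaded files exist only in the local directory, the model would presumably then be loaded by passing that directory path (not the Hub repo id) to `AutoModel.from_pretrained(..., trust_remote_code=True)`.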

le723z

Aug 29

Hi,

I added the two files and replaced the entries in config.json like this:

{
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoModel": "modeling.Qwen2Model",
    "AutoModelForCausalLM": "modeling.Qwen2ForCausalLM",
    "AutoModelForSequenceClassification": "modeling.Qwen2ForSequenceClassification"
  },
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 1536,
  "initializer_range": 0.02,
  "intermediate_size": 8960,
  "max_position_embeddings": 131072,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 12,
  "num_hidden_layers": 28,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.41.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151646
}

However, I still get errors:

/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/transformers/utils/hub.py:127: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct:
- tokenization_qwen.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
Could not locate the modeling.py inside Alibaba-NLP/gte-Qwen2-1.5B-instruct.
Traceback (most recent call last):
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/resolve/main/modeling.py

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/transformers/utils/hub.py", line 402, in cached_file
    resolved_file = hf_hub_download(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1240, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1303, in _hf_hub_download_to_cache_dir
    (url_to_download, etag, commit_hash, expected_size, head_call_error) = _get_metadata_or_catch_error(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1751, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1673, in get_hf_file_metadata
    r = _request_wrapper(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 376, in _request_wrapper
    response = _request_wrapper(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 400, in _request_wrapper
    hf_raise_for_status(response)
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 315, in hf_raise_for_status
    raise EntryNotFoundError(message, response) from e
huggingface_hub.utils._errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-66d07a3b-283b6d9e199d0dce328d4959;d38f0560-c846-4c20-bb8e-88bf25219e2c)

Entry Not Found for url: https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/resolve/main/modeling.py.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/network/scratch/l/le.zhang/light_align/model/test.py", line 7, in <module>
    model = AutoModel.from_pretrained(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 551, in from_pretrained
    model_class = get_class_from_dynamic_module(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 502, in get_class_from_dynamic_module
    final_module = get_cached_module_file(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 306, in get_cached_module_file
    resolved_module_file = cached_file(
  File "/home/mila/l/le.zhang/.conda/envs/openflamingo/lib/python3.9/site-packages/transformers/utils/hub.py", line 456, in cached_file
    raise EnvironmentError(
OSError: Alibaba-NLP/gte-Qwen2-1.5B-instruct does not appear to have a file named modeling.py. Checkout 'https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/tree/main' for available files.

It seems to be looking for the modeling files in https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/tree/main. Do you have any suggestions?