Problem loading the tokenizer.
#1 by nayohan - opened
Hello, I encountered the following problem while loading the tokenizer.
from transformers import LlamaTokenizer, AutoModel

save_llama = ['lcw99/zephykor-ko-beta-7b-chang']
for model_name in save_llama:
    save_model_name = model_name.split('/')[-1]
    print('save_model_name:', save_model_name)
    tokenizer = LlamaTokenizer.from_pretrained(model_name)
    tokenizer.save_pretrained(f"{save_path}/{save_model_name}")
    model = AutoModel.from_pretrained(model_name)
    model.save_pretrained(f"{save_path}/{save_model_name}")
save_model_name: zephykor-ko-beta-7b-chang
Traceback (most recent call last):
File "/home/closedai/.test/hybrid-ltm/src/download_model.py", line 35, in <module>
tokenizer = LlamaTokenizer.from_pretrained(model_name, use_fast=True)
File "/home/closedai/.conda/envs/sent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2024, in from_pretrained
return cls._from_pretrained(
File "/home/closedai/.conda/envs/sent/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/closedai/.conda/envs/sent/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 178, in __init__
self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
File "/home/closedai/.conda/envs/sent/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 203, in get_spm_processor
tokenizer.Load(self.vocab_file)
File "/home/closedai/.conda/envs/sent/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load
return self.LoadFromFile(model_file)
File "/home/closedai/.conda/envs/sent/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string
I think the tokenizer.model file may not have been uploaded, or is there something I did wrong? The model itself loads fine.
Try AutoTokenizer.
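The switch can be sketched like this (a minimal sketch, assuming `./models` as the save directory; AutoTokenizer resolves the fast tokenizer backed by tokenizer.json, so it does not need the SentencePiece tokenizer.model file that LlamaTokenizer tries to load):

```python
def save_dir(save_path: str, model_name: str) -> str:
    # 'lcw99/zephykor-ko-beta-7b-chang' -> '<save_path>/zephykor-ko-beta-7b-chang'
    return f"{save_path}/{model_name.split('/')[-1]}"

def download_and_save(model_name: str, save_path: str = "./models") -> str:
    # AutoTokenizer falls back to the fast (tokenizer.json-based) backend,
    # so it works even when the repo has no tokenizer.model file.
    from transformers import AutoModel, AutoTokenizer

    target = save_dir(save_path, model_name)
    AutoTokenizer.from_pretrained(model_name).save_pretrained(target)
    AutoModel.from_pretrained(model_name).save_pretrained(target)
    return target
```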
I'll give it a try :)
Changing to AutoTokenizer solved the problem very easily. Thank you!
nayohan changed discussion status to closed