Update configuration_llama.py: required to get the model to load, as `rope_scaling` needs to be None or else a dictionary
#1
opened by TheBloke
I checked how the NousResearch YaRN models did it: they have rope_scaling=None defined in this Python file, which is then overridden by the values specified in your config.json.
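For reference, a minimal sketch of what that pattern looks like, assuming a config class modeled on transformers' LlamaConfig (the class name here is illustrative, not the exact contents of this repo's configuration_llama.py):

```python
# Sketch only: rope_scaling defaults to None in the Python file, and
# from_pretrained() passes the dict found in config.json as a kwarg,
# which overrides that default.
from transformers import PretrainedConfig


class CustomLlamaConfig(PretrainedConfig):
    model_type = "llama"

    def __init__(self, rope_scaling=None, **kwargs):
        self.rope_scaling = rope_scaling
        self._rope_scaling_validation()
        super().__init__(**kwargs)

    def _rope_scaling_validation(self):
        # rope_scaling must be either None (no scaling) or a dict describing
        # the scaling scheme (e.g. the YaRN settings shown below).
        if self.rope_scaling is None:
            return
        if not isinstance(self.rope_scaling, dict):
            raise ValueError(
                f"`rope_scaling` must be None or a dict, got {self.rope_scaling!r}"
            )
```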
Before this PR I couldn't load the model at all; with it applied the model loads, and the session below confirms that the correct rope_scaling is then applied:
In [1]: from transformers import AutoModelForCausalLM, AutoTokenizer
In [2]: model = AutoModelForCausalLM.from_pretrained("/workspace/process/ddh0_norocetacean-20b-10k/source" , low_cpu_mem_usage=True, trust_remote_code=True)
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████| 5/5 [00:08<00:00, 1.77s/it]
In [4]: model.config.rope_scaling
Out[4]:
{'factor': 2.5,
'original_max_position_embeddings': 4096,
'type': 'yarn',
'finetuned': False}
In [5]:
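For completeness, the corresponding block in config.json would look roughly like this (values taken from the output above; the rest of the file is omitted):

```json
{
  "rope_scaling": {
    "factor": 2.5,
    "original_max_position_embeddings": 4096,
    "type": "yarn",
    "finetuned": false
  }
}
```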
ddh0 changed pull request status to merged