I periodically encounter infinite generations

#16
by hiauiarau - opened

I periodically encounter infinite generations (the model never emits a stop token) in Qwen2.5-Coder-7B-Instruct with FP8 quantization when feeding long texts of 20k+ characters into the context.

I'm looking at their configs:
https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/config.json
https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/generation_config.json
https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/tokenizer_config.json

In general, they seem to be consistent across the entire model line.
But I have a question about the special tokens. Here is what each file sets:

config.json: "bos_token_id": 151643, which corresponds to "<|endoftext|>" according to the tokenizer, and "eos_token_id": 151645, which corresponds to "<|im_end|>".

generation_config.json: "bos_token_id": 151643 ("<|endoftext|>"), "pad_token_id": 151643 ("<|endoftext|>"), and "eos_token_id": [151645, 151643], a list of the two tokens that config.json uses as eos and bos: "<|im_end|>" and "<|endoftext|>".

tokenizer_config.json: "bos_token": null, "eos_token": "<|im_end|>", "pad_token": "<|endoftext|>". Here I would expect the bos token to be set explicitly, probably to 151644 ("<|im_start|>") rather than 151643 ("<|endoftext|>").

In short, these three configs have completely confused me.
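To make the comparison concrete, here is a minimal Python sketch that puts the values quoted above side by side. The dictionaries are copied by hand from this post rather than loaded from the Hub, and the helper names (`as_id_set`, `eos_ids_per_file`, `summarize`) are made up for illustration.

```python
# Values copied by hand from the three files as quoted in this post.
CONFIG = {"bos_token_id": 151643, "eos_token_id": 151645}
GENERATION_CONFIG = {
    "bos_token_id": 151643,
    "pad_token_id": 151643,
    "eos_token_id": [151645, 151643],
}
TOKENIZER_CONFIG = {
    "bos_token": None,
    "eos_token": "<|im_end|>",
    "pad_token": "<|endoftext|>",
}

# Token string -> id, as quoted in this post.
TOKEN_IDS = {
    "<|endoftext|>": 151643,
    "<|im_start|>": 151644,
    "<|im_end|>": 151645,
}


def as_id_set(value):
    """Normalize None, a single id, or a list of ids into a set of ids."""
    if value is None:
        return set()
    if isinstance(value, (list, tuple)):
        return set(value)
    return {value}


def eos_ids_per_file():
    """Return the set of eos token ids that each file declares."""
    tokenizer_eos = TOKEN_IDS.get(TOKENIZER_CONFIG["eos_token"])
    return {
        "config.json": as_id_set(CONFIG["eos_token_id"]),
        "generation_config.json": as_id_set(GENERATION_CONFIG["eos_token_id"]),
        "tokenizer_config.json": as_id_set(tokenizer_eos),
    }


def summarize():
    """Return (ids every file treats as eos, ids any file treats as eos)."""
    per_file = list(eos_ids_per_file().values())
    return set.intersection(*per_file), set.union(*per_file)


if __name__ == "__main__":
    for name, ids in eos_ids_per_file().items():
        print(name, sorted(ids))
    common, union = summarize()
    print("eos in every file:", sorted(common))
    print("eos in at least one file:", sorted(union))
```

With these values, "<|im_end|>" (151645) is the only id that all three files treat as eos, and generation_config.json additionally lists "<|endoftext|>" (151643).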

Hmm, I also found this note, marked "Important", in the README at https://github.com/QwenLM/Qwen2.5-Coder:

We have updated both the special tokens and their corresponding token ids to maintain consistency with Qwen2.5. The new special tokens are as follows:
{
"<|fim_prefix|>": 151659,
"<|fim_middle|>": 151660,
"<|fim_suffix|>": 151661,
"<|fim_pad|>": 151662,
"<|repo_name|>": 151663,
"<|file_sep|>": 151664,
"<|im_start|>": 151644,
"<|im_end|>": 151645
}

How should config.json, generation_config.json, and tokenizer_config.json be modified so that they are consistent?
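One option that avoids editing the files at all is to pass the stop tokens explicitly at generation time. The sketch below assumes inference through Hugging Face transformers, whose `generate` method accepts `eos_token_id` (a single id or a list), `pad_token_id`, and `max_new_tokens`. The helper name `build_generate_kwargs` and the default cap of 1024 tokens are hypothetical, and whether this actually stops the runaway generations under FP8 is untested.

```python
# Token ids as quoted in this post.
IM_END_ID = 151645      # "<|im_end|>"
ENDOFTEXT_ID = 151643   # "<|endoftext|>"


def build_generate_kwargs(max_new_tokens=1024):
    """Build keyword arguments that make the stop conditions explicit.

    The default of 1024 is an arbitrary illustrative cap, not a
    recommended value.
    """
    return {
        # Stop on either token, mirroring generation_config.json.
        "eos_token_id": [IM_END_ID, ENDOFTEXT_ID],
        # Pad with "<|endoftext|>", as both config files already do.
        "pad_token_id": ENDOFTEXT_ID,
        # Hard upper bound so a missed stop token cannot loop forever.
        "max_new_tokens": max_new_tokens,
    }


# Hypothetical usage (requires transformers and the model weights):
#   output_ids = model.generate(**inputs, **build_generate_kwargs())
```

The `max_new_tokens` cap does not explain why the stop token is missed, but it bounds the damage while the configs are being investigated.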
