Fix `eos_token_id`
It seems `eos_token_id` should be `<|end|>` (32007) instead of `<|endoftext|>` (32000).
Context: https://twitter.com/altryne/status/1783567596467491109?t=k5HHVmTCGDt4-TkXF8KyNw&s=19
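In the meantime, here is a minimal workaround sketch: verify the token ids and pass the correct EOS id to `generate()` explicitly. The checkpoint name below is a placeholder assumption (substitute whichever Phi-3 checkpoint you are using), and depending on your `transformers` version you may also need `trust_remote_code=True`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id (assumption): substitute the checkpoint you actually use.
model_id = "microsoft/Phi-3-mini-4k-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Verify the ids mentioned above.
print(tokenizer.convert_tokens_to_ids("<|end|>"))        # expected: 32007
print(tokenizer.convert_tokens_to_ids("<|endoftext|>"))  # expected: 32000

# Workaround: pass the EOS id explicitly so generation stops at <|end|>,
# regardless of what the shipped config says.
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|end|>"),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```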
Please ensure that you are using the configuration defined in generation_config.json.
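For reference, a quick way to inspect which EOS ids the shipped generation_config.json actually sets (same placeholder checkpoint id as above):

```python
from transformers import GenerationConfig

# Placeholder id (assumption): use your actual checkpoint.
gen_config = GenerationConfig.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# transformers applies this config in generate() by default, so whatever
# is listed here determines where generation stops.
print(gen_config.eos_token_id)
```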
That config is pretty lazy, though. Now there are three token ids listed as potential EOS tokens?
Hi @gugarosa, the `<|endoftext|>` token is only functional for the base model's training; `<|end|>` is supposed to be the EOS token for the instruct model and for multi-turn conversation (see the sketch after the links below).
Please also refer to:
https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/commit/4d6c61da057c45bfc4dc4d3bfa5a691ecb9ce0cf
https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/commit/a8977699a3d0820e80129fb3c93c20fbd9972c41
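To illustrate the multi-turn point, one can render the chat template and check what terminates each turn; a small sketch, again assuming a placeholder Phi-3 checkpoint:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint id (assumption), as in the earlier sketches.
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

messages = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello!"},
]

# If <|end|> is the turn terminator, it should appear after each message
# in the rendered template, i.e. it is what generation must stop on in chat.
print(tokenizer.apply_chat_template(messages, tokenize=False))
```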
generation_config.json is indeed a lazy fix. Adding a token that will never be generated to `eos_token_id` is just not the correct approach.
Please reconsider, and change `eos_token_id` to `<|end|>`. Thanks!