Missing <|im_start|> from tokenizer_config.json

#3
by bartowski - opened

Curious why it's missing. It seems to break tokenization, because the token isn't marked as a special token.

Adding it to tokenizer_config.json fixes my tokenization issue
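
A minimal sketch of what such an edit could look like, assuming the config uses the `added_tokens_decoder` layout that Hugging Face tokenizers commonly write. The helper name `mark_special`, the empty starting config, and the token ID `101` are hypothetical placeholders; the real ID has to come from the model's own vocabulary.

```python
import json


def mark_special(config: dict, token: str, token_id: int) -> dict:
    """Return a copy of a tokenizer_config-style dict with `token` registered as special."""
    # Deep copy through a JSON round-trip so the caller's dict is left untouched.
    updated = json.loads(json.dumps(config))
    decoder = updated.setdefault("added_tokens_decoder", {})
    # Keys in added_tokens_decoder are token IDs stored as strings.
    decoder[str(token_id)] = {
        "content": token,
        "lstrip": False,
        "normalized": False,
        "rstrip": False,
        "single_word": False,
        "special": True,  # the flag that marks the token as special
    }
    return updated


# Hypothetical starting point: a config with no added tokens registered yet.
config = {"added_tokens_decoder": {}}

# 101 is a placeholder ID, not the model's actual ID for this token.
patched = mark_special(config, "<|im_start|>", 101)
print(json.dumps(patched, indent=2))
```

With a real file, the same function could be applied to the dict loaded from tokenizer_config.json and the result written back with `json.dump`.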

Hi 👋 @bartowski, thank you very much! Could you share how you are using it (for example, your inference code)? That would help us reproduce your issue accurately. Thanks again!

(closing this one to continue discussion in the PR)

bartowski changed discussion status to closed
