Incorrect new line token in vocabulary

#1
by giladgd - opened

The newline token in the vocabulary of the converted files is "Ä" instead of "\n". As a result, input that contains line breaks is tokenized incorrectly, and the model outputs bad completions for such inputs.

Do you have a suggestion on how to fix it or a workaround?

Never mind, there was an issue on my side.
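
The thread does not say what the issue was. For anyone who sees the same symptom, here is a minimal sketch of one thing worth checking. It assumes the model uses a GPT-2 style byte-level BPE vocabulary, which is not confirmed anywhere in this discussion. In that scheme the newline byte is stored in the vocabulary as the printable stand-in "Ċ" (U+010A), and reading the vocabulary file with the wrong encoding can make that character show up as "Ä". The helper `bytes_to_unicode` below is written out here for illustration and follows the mapping used by GPT-2's tokenizer.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2's byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every other byte (control characters, space, ...) is shifted to 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


table = bytes_to_unicode()

# How a byte-level BPE vocabulary spells the newline byte (0x0A).
newline_token = table[ord("\n")]
print(repr(newline_token))  # 'Ċ'

# Assumption: the vocabulary file is UTF-8 but gets decoded as Latin-1.
misread = newline_token.encode("utf-8").decode("latin-1")
print(repr(misread))  # 'Ä\x8a'
```

If the vocabulary entry is "Ċ" when the file is read as UTF-8, the conversion is likely fine and the "Ä" comes from how the file or the token is decoded or displayed downstream.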

giladgd changed discussion status to closed
