Incorrect new line token in vocabulary

by giladgd - opened Feb 17

Feb 17

The newline token in the vocabulary of the converted files is "Ä" instead of "\n". As a result, input that contains line breaks is tokenized incorrectly, and the model outputs bad completions for such input.
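
For anyone checking the same symptom, here is a minimal sketch of one thing worth ruling out. It assumes the converted files use a GPT-2-style byte-level BPE vocabulary, which this thread does not confirm. In that scheme, non-printable bytes are stored as stand-in characters, so a newline appears in the vocabulary as "Ċ" (U+010A) rather than as a literal "\n". If that character's UTF-8 bytes are then read as Latin-1, the first visible character is "Ä". The names `bytes_to_unicode`, `BYTE_TO_CHAR`, and `misdecode` are illustrative and not taken from any specific conversion tool.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2 byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(0x21, 0x7F))      # '!' .. '~'
        + list(range(0xA1, 0xAD))    # U+00A1 .. U+00AC
        + list(range(0xAE, 0x100))   # U+00AE .. U+00FF
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is mapped to 256, 257, ... in ascending byte order.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


BYTE_TO_CHAR = bytes_to_unicode()
newline_symbol = BYTE_TO_CHAR[ord("\n")]  # stand-in character for byte 0x0A
space_symbol = BYTE_TO_CHAR[ord(" ")]     # stand-in character for byte 0x20


def misdecode(symbol, wrong_encoding="latin-1"):
    """Simulate reading a UTF-8 encoded vocabulary entry with the wrong encoding."""
    return symbol.encode("utf-8").decode(wrong_encoding)


print(hex(ord(newline_symbol)))                # 0x10a
print(hex(ord(misdecode(newline_symbol)[0])))  # 0xc4
```

Code point 0xC4 is "Ä". If the vocabulary entry really is U+010A, the token is the expected newline stand-in and the problem is more likely in how the file is read or displayed than in the conversion itself.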

giladgd

Feb 17

Do you have a suggestion on how to fix it or a workaround?

giladgd

Feb 17

Never mind, there was an issue on my side.

giladgd changed discussion status to closed Feb 17
