Fix error when force_tokens includes multi-word sequence to preserve

#2

Right now an error occurs when force_tokens includes a sequence that contains spaces and that sequence appears in the prompt.
The cause is the statement word, label = line.split(label_sep), where line is, for example, The answer is 1 (here "The answer is" is the sequence in force_tokens and 1 is its label).
The error is thrown because line.split(label_sep) returns ['The', 'answer', 'is', '1'], which is too many values to unpack into word, label.

The fix is to split only once, at the last occurrence of label_sep (that is, the first occurrence counting from the right).
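
A minimal sketch of the idea, not the merged code itself: it assumes label_sep is a single space and uses Python's built-in str.rsplit with maxsplit=1. The helper name split_word_label is hypothetical and only serves the illustration.

```python
from typing import Tuple


def split_word_label(line: str, label_sep: str = " ") -> Tuple[str, str]:
    """Split a line into (word, label) at the last occurrence of label_sep.

    Hypothetical helper for illustration; label_sep defaulting to a single
    space is an assumption, not something the PR states.
    """
    # rsplit with maxsplit=1 splits at most once, starting from the right,
    # so any separators inside the word part are left untouched.
    word, label = line.rsplit(label_sep, 1)
    return word, label


line = "The answer is 1"

# The original approach: split() yields four pieces, so unpacking into
# two names raises ValueError.
try:
    word, label = line.split(" ")
    split_failed = False
except ValueError:
    split_failed = True

# The right-anchored split keeps the multi-word sequence intact.
word, label = split_word_label(line)
```

With this input, word holds the full multi-word sequence and label holds the trailing label. Lines with a single-word token, such as cat 0, behave the same as before.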

qianhuiwu changed pull request status to merged
Microsoft org

Thanks. Merged the fix for the string split.
