How to do continued pre-training on the 7B-Instruct model?

by YalunHu - opened

To improve the model's code-generation and code-completion ability, I want to do continued pre-training on this instruction-tuned model. How should I prepare the pre-training data? Is it enough to append the "<|endoftext|>" token to the end of each chunk of code text?
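
A minimal sketch of the data layout described above, not a confirmed recipe for this model: each code chunk gets "<|endoftext|>" appended, the chunks are concatenated into one token stream, and the stream is cut into fixed-size training blocks. The names `toy_tokenize` and `pack_documents` and the block size are hypothetical, and the whitespace tokenizer is only a stand-in. A real run would use the model's own tokenizer and its actual end-of-text token.

```python
from typing import Iterable, List

EOS_TOKEN = "<|endoftext|>"  # separator named in the question; verify against the model's tokenizer


def toy_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (whitespace split). Hypothetical: swap in the model's tokenizer."""
    return text.split()


def pack_documents(docs: Iterable[str], block_size: int) -> List[List[str]]:
    """Append EOS after each document, concatenate, and cut into fixed-size blocks."""
    stream: List[str] = []
    for doc in docs:
        stream.extend(toy_tokenize(doc))
        stream.append(EOS_TOKEN)  # marks the document boundary
    # Keep only full blocks; the trailing remainder is dropped.
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]


docs = ["def add(a, b): return a + b", "print('hello')"]
blocks = pack_documents(docs, block_size=4)
for block in blocks:
    print(block)
```

Packing this way means a block can span several chunks, with the end-of-text token as the only boundary between them. Whether plain-text continued pre-training preserves the instruction-following behaviour of an instruct model is a separate question that this sketch does not address.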
