size of the vocabulary
#1, opened by yuimo
What is the size of the vocabulary? And how is the tokenizer trained: BPE or WordPiece?
The vocabulary size is 65536.
It's a greedy tokenizer.
See https://github.com/BlinkDL/ChatRWKV/tree/main/tokenizer
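To illustrate what "greedy" could mean here, below is a minimal sketch of greedy longest-match tokenization. It assumes that "greedy" means taking the longest vocabulary entry that matches at each position; this is not the code from the linked repository, and `greedy_tokenize` and `toy_vocab` are hypothetical names made up for this example.

```python
def greedy_tokenize(text: bytes, vocab: dict[bytes, int]) -> list[int]:
    """Greedy longest-match tokenization over a fixed vocabulary.

    Illustrative sketch only: at each position, take the longest
    vocabulary entry that matches, emit its id, and move past it.
    """
    max_len = max(len(token) for token in vocab)
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos:pos + length]
            if piece in vocab:
                ids.append(vocab[piece])
                pos += length
                break
        else:
            # No vocabulary entry matches at this position.
            raise ValueError(f"no token matches at byte offset {pos}")
    return ids


# Hypothetical toy vocabulary (the real one has 65536 entries).
toy_vocab = {b"a": 0, b"b": 1, b"c": 2, b"ab": 3, b"abc": 4, b"bc": 5}

print(greedy_tokenize(b"abcab", toy_vocab))  # [4, 3]
```

In this sketch there are no merge rules to apply at encode time, unlike BPE: the tokenizer only needs the vocabulary itself. A real implementation would likely use a trie rather than repeated slicing to find the longest match quickly.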