Why is the tokenizer's vocab_size different from the model's?

#2
by xianf - opened

I am new to T5 and mT5. I found that the tokenizer's vocab_size is 250100, but the model's embedding size is 250112. Could you tell me why they are different?
Thanks in advance!

xianf changed discussion status to closed

Got the answer: the model's embedding matrix is padded from 250100 up to 250112 so that the vocabulary size is a multiple of 128, which is more efficient on accelerator hardware (TPUs in particular). The extra 12 rows don't correspond to any real token, and the tokenizer never produces those ids.
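
For anyone who lands here later, a quick sketch to see both numbers side by side (assuming the `transformers` library and the `google/mt5-small` checkpoint; any mT5 size shows the same mismatch):

```python
# Quick check of tokenizer vs. model vocabulary sizes.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

print(tokenizer.vocab_size)                         # 250100: ids the tokenizer can emit
print(model.get_input_embeddings().num_embeddings)  # 250112: rows in the embedding matrix
print(model.config.vocab_size)                      # 250112: padded size stored in the config

# The gap is padding: 250112 is the next multiple of 128 above 250100.
assert model.config.vocab_size % 128 == 0
```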
