Clarification: w2v-BERT 2.0 was first presented in SeamlessM4T v1 (not v2)
#21 by zuazo - opened
Please note that the w2v-BERT 2.0 model was initially introduced in the "SeamlessM4T v1" paper, specifically in Section 4.1, available at https://arxiv.org/abs/2308.11596.
While the "SeamlessM4T v2" paper also discusses this model, it does not delve into the same level of detail as the v1 paper.
Thanks for the note. Would you like to open a hub PR to correct this?
The architecture is the same, but the model was trained on 4.5M hours of audio for Seamless v2, whereas for v1 it was trained on 1M hours of audio. Also, I think they only open-sourced the weights for v2.