On the "large" version of the model

#2
by mutiann - opened

Hello,

It is very valuable to have a PyTorch version of this model, thank you very much for that! I am trying to use the large version of the original model (https://huggingface.co/hyperonym/xlm-roberta-longformer-large-16384). Would you be interested in creating a PyTorch version of it as well? Or would you mind sharing the method for converting the original TF model into this PyTorch version, so that I could try it myself?

Thank you!

Hi @mutiann,

thanks for reaching out. I have just uploaded the PyTorch version of the large model as well: https://huggingface.co/severinsimmler/xlm-roberta-longformer-large-16384

Converting TensorFlow to PyTorch is actually quite straightforward:

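# Requires both TensorFlow and PyTorch to be installed.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("hyperonym/xlm-roberta-longformer-large-16384")
# from_tf=True loads the TensorFlow checkpoint into the PyTorch model class.
model = AutoModel.from_pretrained("hyperonym/xlm-roberta-longformer-large-16384", from_tf=True)

# Save the tokenizer files and the PyTorch weights to a local directory.
tokenizer.save_pretrained("xlm-roberta-longformer-large-16384")
model.save_pretrained("xlm-roberta-longformer-large-16384")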

Please note that I haven't tried fine-tuning the large model yet, so there is no guarantee that the conversion didn't break anything. I did fine-tune the base model some time ago, though, and it performed really well.
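If you want a quick sanity check before fine-tuning, one option is to run the same input through the original TF checkpoint and the converted PyTorch copy and compare the hidden states. The following is only a rough sketch, not something I have run on the large model: the helper names `max_abs_diff` and `compare_tf_and_pytorch`, the sample text, and the tolerance `atol=1e-4` are all assumptions made for illustration, and it assumes `TFAutoModel` can load the original repository.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def compare_tf_and_pytorch(tf_repo, pt_dir, text="Hello world", atol=1e-4):
    """Compare last hidden states of the TF checkpoint and the converted PyTorch copy.

    Hypothetical helper: `atol` is an arbitrary tolerance, not a recommended value.
    """
    # Heavy dependencies are imported lazily; this needs transformers, torch and tensorflow.
    import torch
    from transformers import AutoModel, AutoTokenizer, TFAutoModel

    tokenizer = AutoTokenizer.from_pretrained(pt_dir)
    tf_model = TFAutoModel.from_pretrained(tf_repo)
    pt_model = AutoModel.from_pretrained(pt_dir).eval()

    tf_out = tf_model(**tokenizer(text, return_tensors="tf")).last_hidden_state
    with torch.no_grad():
        pt_out = pt_model(**tokenizer(text, return_tensors="pt")).last_hidden_state

    diff = max_abs_diff(
        tf_out.numpy().ravel().tolist(),
        pt_out.numpy().ravel().tolist(),
    )
    return diff, diff <= atol
```

Calling `compare_tf_and_pytorch("hyperonym/xlm-roberta-longformer-large-16384", "xlm-roberta-longformer-large-16384")` would return the largest difference and whether it is within the tolerance. A small difference only shows that the weights were loaded consistently; it says nothing about how well the model fine-tunes.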

Thank you very much! I will give it a try.

mutiann changed discussion status to closed
