Can I hot-swap in any encoder-decoder model?
Thanks for the amazing research and publications!
I was reading the paper, and it appears that SLED allows using any pretrained encoder-decoder model.
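    from transformers import AutoModel, AutoTokenizer
    import sled  # assumed to be needed so the SLED classes are registered with the Auto classes

    if __name__ == '__main__':
        # Load the model and tokenizer for Bart-base-SLED
        bart_base_sled_model = AutoModel.from_pretrained('tau/bart-base-sled')
        tokenizer = AutoTokenizer.from_pretrained('tau/bart-base-sled')
        bart_base_sled_model.eval()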
The code examples use bart_base_sled_model. How do you create these models?
Hi @ljhwild,
Thanks for your kind words!
Indeed, you can use any encoder-decoder model with SLED, though some custom models may require minor tweaking to expose the required methods through the same interface that BART and T5 do, for example. To use a model, you simply list the name of the underlying model/tokenizer so that SLED can load it. If it's already on GitHub then great, but you can also work with local config files if needed. You're absolutely right that I need to update the README here to make this clearer. Take a look at the repo https://github.com/mivg/sled for details on how to use your own model, and open an issue there if you need any help; I'll try to assist you.
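To make the "local config files" route a bit more concrete, here is a rough sketch and not the official procedure. It writes a small config.json that names the underlying model. The helper write_sled_config is hypothetical, and the key names (model_type, underlying_config, context_size) and their values are assumptions based on how the published SLED checkpoints appear to be configured, so please check them against the repo before relying on them.

```python
import json
from pathlib import Path


def write_sled_config(base_model: str, out_dir: str, context_size: int = 256) -> Path:
    """Write a config.json that names the underlying encoder-decoder model.

    Hypothetical helper: the key names below are assumptions and should be
    verified against https://github.com/mivg/sled.
    """
    config = {
        "model_type": "tau/sled",          # assumed identifier for SLED configs
        "underlying_config": base_model,   # assumed key: name of the base model to wrap
        "context_size": context_size,      # assumed key: tokens per encoded chunk
    }
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / "config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


if __name__ == "__main__":
    # Example: point the config at T5 instead of BART.
    print(write_sled_config("t5-base", "my-t5-base-sled"))
```

If the keys match what the repo expects, the resulting directory could presumably be passed to from_pretrained in place of 'tau/bart-base-sled', but I have not verified that here.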