Max Context length?

by lazyDataScientist - opened

Just wondering what the max context length for this model is at the moment?

It doesn’t have a hard-coded max context length like a transformer. It works kind of like an LSTM: you can just keep adding input and it will keep going. It “remembers” the past context selectively, so it doesn’t lose too much performance.
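To make the "keeps going" part concrete, here is a rough toy sketch of a gated recurrence. This is not this model's actual code; `run_recurrent`, `state_size`, and the gating formula are all made up for illustration. The point is only that the state has a fixed size no matter how many inputs you feed in, and an input-dependent gate decides how much of the old state to keep.

```python
import math

def run_recurrent(tokens, state_size=4):
    """Toy gated recurrence (hypothetical, for illustration only)."""
    state = [0.0] * state_size
    for t in tokens:
        new_state = []
        for i, s in enumerate(state):
            # Input-dependent gate in (0, 1): how much old state to keep.
            gate = 1.0 / (1.0 + math.exp(-t * (i + 1)))
            new_state.append(gate * s + (1.0 - gate) * t)
        state = new_state
    return state

short = run_recurrent([0.5, -1.0, 2.0])
long = run_recurrent([0.1] * 10_000)
print(len(short), len(long))  # prints "4 4": same state size for both
```

Memory stays constant as the input grows, which is why there is no hard limit. How well very old context is actually retained is a separate question that depends on the real model and its training.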


Here they talk about it in the part about synthetic tasks
