Max Context length?
#5
by
lazyDataScientist
- opened
Just wondering what the max context length for this model is at the moment.
It doesn’t have a hard-coded max context length like a transformer. It works kind of like an LSTM: you can just keep adding input and it will keep going. It “remembers” the past context selectively, so it doesn’t lose too much performance.
see: https://arxiv.org/pdf/2312.00752.pdf
They talk about it in the section on synthetic tasks.
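To make the “works kind of like an LSTM” point concrete, here’s a toy recurrent state update (a sketch of the general SSM idea, not Mamba’s actual selective-scan implementation; the matrices `A` and `B` are made up for illustration). The hidden state has a fixed size, so memory cost stays constant no matter how long the input gets — which is why there’s no hard context limit:

```python
import numpy as np

state_dim = 16
rng = np.random.default_rng(0)

A = 0.9 * np.eye(state_dim)          # hypothetical decay dynamics
B = rng.standard_normal(state_dim)   # hypothetical input projection

h = np.zeros(state_dim)              # fixed-size hidden state

for t in range(100_000):             # arbitrarily long input stream
    x_t = rng.standard_normal()      # one scalar input per step
    h = A @ h + B * x_t              # state update: size never grows

print(h.shape)                       # still (16,) after 100k steps
```

The catch is that everything has to fit into that fixed-size state, so old context is compressed rather than stored exactly — that’s the selective “remembering” the paper evaluates on those synthetic tasks.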