Longer context
#10
by
salazaaar
- opened
The model is currently restricted to a context of 4096 tokens. Is there a way to extend it without retraining?
@salazaaar
I think the model supports a context window of 8192 tokens (see config.json: "max_position_embeddings": 8192).
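As a quick sanity check, you can read that field straight out of config.json yourself. A minimal sketch (the config snippet below is illustrative, trimmed to just the relevant keys):

```python
import json

# Illustrative fragment of a Llama-style config.json (trimmed; only the
# keys relevant here are shown).
sample = '{"model_type": "llama", "max_position_embeddings": 8192}'
config = json.loads(sample)

max_ctx = config["max_position_embeddings"]
print(max_ctx)  # → 8192
```

In practice you would `json.load()` the actual config.json shipped with the checkpoint instead of the inline string.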
8192 is what the native Llama 3 model supports. If you want to go beyond that without retraining, you can try methods such as
LM-Infinite: https://arxiv.org/abs/2308.16137
StreamingLLM: https://arxiv.org/abs/2309.17453
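Both papers extend usable context at inference time by restricting which cached positions attention can see, rather than by retraining. As a rough, toy illustration of the attention-sink idea from the second paper (StreamingLLM): keep the first few "sink" positions forever, plus a sliding window of the most recent ones. The class and parameter names below are hypothetical, not from either paper's code:

```python
from collections import deque

class SinkWindowCache:
    """Toy cache-eviction policy in the spirit of attention sinks:
    always retain the first `n_sink` positions plus the most recent
    `window` positions. Illustrative only — real implementations
    evict actual key/value tensors, not position indices."""

    def __init__(self, n_sink=4, window=8):
        self.n_sink = n_sink
        self.sinks = []                      # earliest positions, kept forever
        self.recent = deque(maxlen=window)   # sliding window of recent positions

    def append(self, pos):
        if len(self.sinks) < self.n_sink:
            self.sinks.append(pos)
        else:
            self.recent.append(pos)          # deque evicts the oldest automatically

    def visible(self):
        # Positions the current step may attend to.
        return self.sinks + list(self.recent)

cache = SinkWindowCache(n_sink=2, window=3)
for pos in range(10):
    cache.append(pos)
print(cache.visible())  # → [0, 1, 7, 8, 9]
```

Because the attended set stays bounded, memory and per-step cost no longer grow with total sequence length, which is what lets these methods run far past the trained context size.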
Haoxiang-Wang
changed discussion status to
closed