Longer context
#10
by
salazaaar
- opened
The model is currently restricted to a context of 4096 tokens. Is there a way to extend it without retraining?
@salazaaar
I think the model supports a context window of 8192 tokens (see config.json: "max_position_embeddings": 8192).
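As a quick sanity check, you can read that field straight out of config.json yourself. A minimal sketch (the config snippet below is illustrative, trimmed to just the relevant keys):

```python
import json

# Illustrative fragment of a Llama-style config.json (trimmed; only the
# keys relevant here are shown).
sample = '{"model_type": "llama", "max_position_embeddings": 8192}'
config = json.loads(sample)

max_ctx = config["max_position_embeddings"]
print(max_ctx)  # → 8192
```

In practice you would `json.load()` the actual config.json shipped with the checkpoint instead of the inline string.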
8192 is what the native Llama 3 model supports. If you want to go beyond that without retraining, you can try methods such as
LM-Infinite: https://arxiv.org/abs/2308.16137
StreamingLLM: https://arxiv.org/abs/2309.17453
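Both papers extend usable context at inference time by restricting which cached positions attention can see, rather than by retraining. As a rough, toy illustration of the attention-sink idea from the second paper (StreamingLLM): keep the first few "sink" positions forever, plus a sliding window of the most recent ones. The class and parameter names below are hypothetical, not from either paper's code:

```python
from collections import deque

class SinkWindowCache:
    """Toy cache-eviction policy in the spirit of attention sinks:
    always retain the first `n_sink` positions plus the most recent
    `window` positions. Illustrative only — real implementations
    evict actual key/value tensors, not position indices."""

    def __init__(self, n_sink=4, window=8):
        self.n_sink = n_sink
        self.sinks = []                      # earliest positions, kept forever
        self.recent = deque(maxlen=window)   # sliding window of recent positions

    def append(self, pos):
        if len(self.sinks) < self.n_sink:
            self.sinks.append(pos)
        else:
            self.recent.append(pos)          # deque evicts the oldest automatically

    def visible(self):
        # Positions the current step may attend to.
        return self.sinks + list(self.recent)

cache = SinkWindowCache(n_sink=2, window=3)
for pos in range(10):
    cache.append(pos)
print(cache.visible())  # → [0, 1, 7, 8, 9]
```

Because the attended set stays bounded, memory and per-step cost no longer grow with total sequence length, which is what lets these methods run far past the trained context size.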
Haoxiang-Wang
changed discussion status to
closed