Training LLMs over Neurally Compressed Text
Paper: arXiv:2404.03626
Table 5: Transformers struggle to learn Arithmetic Coding. In the sequence-to-sequence setting, a model that learns AC compression/decompression should …
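For context on the caption above: arithmetic coding maps a symbol sequence to a nested subinterval of [0, 1), narrowing the interval by each symbol's probability mass; any number inside the final interval identifies the sequence. A minimal sketch using exact rational arithmetic follows — the alphabet, probabilities, and `<eof>` terminator are hypothetical illustrations, not the paper's tokenizer or coder:

```python
from fractions import Fraction

# Hypothetical static distribution over a tiny alphabet; <eof> marks end of message.
PROBS = {"A": Fraction(3, 5), "B": Fraction(3, 10), "<eof>": Fraction(1, 10)}

def intervals(probs):
    """Assign each symbol a half-open subinterval of [0, 1) with width = its probability."""
    lo, out = Fraction(0), {}
    for sym, p in probs.items():
        out[sym] = (lo, lo + p)
        lo += p
    return out

def encode(msg, probs):
    """Narrow [0, 1) by each symbol's interval; return a number inside the final interval."""
    ivals = intervals(probs)
    lo, hi = Fraction(0), Fraction(1)
    for sym in list(msg) + ["<eof>"]:
        width = hi - lo
        s_lo, s_hi = ivals[sym]
        lo, hi = lo + width * s_lo, lo + width * s_hi
    return (lo + hi) / 2  # midpoint of the final interval

def decode(code, probs):
    """Replay the narrowing: at each step, find whose subinterval contains the code."""
    ivals = intervals(probs)
    lo, hi = Fraction(0), Fraction(1)
    out = []
    while True:
        x = (code - lo) / (hi - lo)  # position of code within the current interval
        for sym, (s_lo, s_hi) in ivals.items():
            if s_lo <= x < s_hi:
                if sym == "<eof>":
                    return out
                out.append(sym)
                width = hi - lo
                lo, hi = lo + width * s_lo, lo + width * s_hi
                break
```

Decompression must invert this narrowing exactly, which is why `decode` replays the same interval updates as `encode` — a round trip like `decode(encode(["A", "B", "A"], PROBS), PROBS)` recovers the original sequence.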