Enhancing Training Efficiency Using Packing with Flash Attention Paper • 2407.09105 • Published Jul 12 • 14
view article Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention Aug 21 • 22
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • 2408.07055 • Published Aug 13 • 65