DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models Paper • 2309.14509 • Published Sep 25, 2023 • 17
Chain-of-Verification Reduces Hallucination in Large Language Models Paper • 2309.11495 • Published Sep 20, 2023 • 38