- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 84
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
  Paper • 2404.10667 • Published • 17
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 24
- DoRA: Weight-Decomposed Low-Rank Adaptation
  Paper • 2402.09353 • Published • 26
Collections including paper arxiv:2404.01856
- Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
  Paper • 2403.09029 • Published • 54
- LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
  Paper • 2403.12968 • Published • 24
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 67
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 72

- A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese
  Paper • 2304.08999 • Published • 2
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
  Paper • 2309.09400 • Published • 82
- Robust Open-Vocabulary Translation from Visual Text Representations
  Paper • 2104.08211 • Published • 1
- Poro 34B and the Blessing of Multilinguality
  Paper • 2404.01856 • Published • 13

- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 80
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 26
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 65
- Poro 34B and the Blessing of Multilinguality
  Paper • 2404.01856 • Published • 13