A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese Paper • 2304.08999 • Published Apr 18, 2023 • 2
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 83
Robust Open-Vocabulary Translation from Visual Text Representations Paper • 2104.08211 • Published Apr 16, 2021 • 1
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Paper • 2404.04167 • Published Apr 5, 2024 • 12
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Paper • 2404.12387 • Published Apr 18, 2024 • 38
Adapting Safe-for-Work Classifier for Malaysian Language Text: Enhancing Alignment in LLM-Ops Framework Paper • 2407.20729 • Published Jul 30, 2024 • 25
Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings Paper • 2407.20581 • Published Jul 30, 2024 • 23