Orion-14B: Open-source Multilingual Large Language Models
Abstract
In this study, we introduce Orion-14B, a collection of multilingual large language models with 14 billion parameters. We utilize a data scheduling approach to train a foundation model on a diverse corpus of 2.5 trillion tokens, sourced from texts in English, Chinese, Japanese, Korean, and other languages. Additionally, we fine-tune a series of models tailored for conversational applications and other specific use cases. Our evaluation results demonstrate that Orion-14B achieves state-of-the-art performance across a broad spectrum of tasks. We make the Orion-14B model family and its associated code publicly accessible at https://github.com/OrionStarAI/Orion, aiming to inspire future research and practical applications in the field.
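The data-scheduling idea mentioned in the abstract lends itself to a brief illustration. Below is a minimal Python sketch of a staged multilingual sampler: the stage sizes, language codes, and mixture weights are invented for exposition, and `schedule_batches` is a hypothetical helper, not code from the Orion repository (the paper does not publish its exact schedule).

```python
import random
from itertools import cycle

# Hypothetical staged multilingual data schedule. Stage sizes and mixture
# weights are invented for illustration only; the paper does not publish
# its actual schedule.
SCHEDULE = [
    # (tokens to draw in this stage, per-language sampling weights)
    (1_000_000_000_000, {"en": 0.60, "zh": 0.30, "ja": 0.05, "ko": 0.03, "other": 0.02}),
    (1_500_000_000_000, {"en": 0.45, "zh": 0.35, "ja": 0.10, "ko": 0.07, "other": 0.03}),
]

def schedule_batches(corpora, batch_tokens=4096, seed=0):
    """Yield (language, document) pairs following the staged mixture.

    `corpora` maps a language code to an (endless) iterator of documents.
    """
    rng = random.Random(seed)
    for stage_tokens, weights in SCHEDULE:
        langs, probs = zip(*weights.items())
        drawn = 0
        while drawn < stage_tokens:
            lang = rng.choices(langs, weights=probs, k=1)[0]
            yield lang, next(corpora[lang])
            drawn += batch_tokens

# Toy usage: replace the cycled placeholders with real corpus readers.
corpora = {lang: cycle([f"<{lang} doc>"]) for lang in ["en", "zh", "ja", "ko", "other"]}
sampler = schedule_batches(corpora)
print(next(sampler))  # e.g. ('en', '<en doc>')
```

In practice one would sample pre-tokenized shards at the batch level rather than individual documents, but the stage-wise reweighting of the language mixture is the essential idea.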
Community
Looks good, but the license is misleading. GitHub says Apache, but read further and you'll see that only the code is Apache; the model weights require a commercial license, which is also revocable.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- TeleChat Technical Report (2024)
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism (2024)
- PersianMind: A Cross-Lingual Persian-English Large Language Model (2024)
- YAYI 2: Multilingual Open-Source Large Language Models (2023)
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer (2024)