---
tags:
- Multilingual
license: mit
language:
- af
- am
- ar
- hy
- as
- ast
- az
- be
- bn
- bs
- bg
- my
- ca
- ceb
- zho
- hr
- cs
- da
- nl
- en
- et
- tl
- fi
- fr
- ff
- gl
- lg
- ka
- de
- el
- gu
- ha
- he
- hi
- hu
- is
- ig
- id
- ga
- it
- ja
- jv
- kea
- kam
- kn
- kk
- km
- ko
- ky
- lo
- lv
- ln
- lt
- luo
- lb
- mk
- ms
- ml
- mt
- mi
- mr
- mn
- ne
- ns
- no
- ny
- oc
- or
- om
- ps
- fa
- pl
- pt
- pa
- ro
- ru
- sr
- sn
- sd
- sk
- sl
- so
- ku
- es
- sw
- sv
- tg
- ta
- te
- th
- tr
- uk
- umb
- ur
- uz
- vi
- cy
- wo
- xh
- yo
- zu
---

### Model Sources

- **Paper**: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
- **Link**: https://arxiv.org/pdf/2407.05975
- **Repository**: https://github.com/CONE-MT/LLaMAX/

### Model Description

🔥 LLaMAX-7B-X-NLI is a multilingual natural language inference (NLI) model, obtained by fully fine-tuning the powerful multilingual model [LLaMAX-7B](https://huggingface.co/LLaMAX/LLaMAX-7B) on the MultiNLI dataset.

🔥 Compared with Llama-2 fine-tuned in the same setting, LLaMAX-7B-X-NLI improves the average accuracy on the XNLI dataset by 5.6 points (70.6 → 76.2).

### Experiments

Accuracy (%) on XNLI, by language:

| XNLI             | Avg. | Sw   | Ur   | Hi   | Th   | Ar   | Tr   | El   | Vi   | Zh   | Ru   | Bg   | De   | Fr   | Es   | En   |
|------------------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
| Llama2-7B-X-XNLI | 70.6 | 44.6 | 55.1 | 62.2 | 58.4 | 64.7 | 64.9 | 65.6 | 75.4 | 75.9 | 78.9 | 78.6 | 80.7 | 81.7 | 83.1 | 89.5 |
| LLaMAX-7B-X-XNLI | 76.2 | 66.7 | 65.3 | 69.1 | 66.2 | 73.6 | 71.8 | 74.3 | 77.4 | 78.3 | 80.3 | 81.6 | 82.2 | 83.0 | 84.1 | 89.7 |

### Model Usage

Code example (a reusable wrapper around this pattern is sketched after the citation below):

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the fine-tuned model and tokenizer (replace the placeholders with your paths).
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

# The model expects a "Premise: ... Hypothesis: ... Label:" prompt and generates the label.
query = "Premise: She doesn’t really understand. Hypothesis: Actually, she doesn’t get it. Label:"
inputs = tokenizer(query, return_tensors="pt")
generate_ids = model.generate(inputs.input_ids, max_length=30)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => Entailment
```

### Citation

If our model helps your work, please cite this paper:

```
@article{lu2024llamax,
  title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
  author={Lu, Yinquan and Zhu, Wenhao and Li, Lei and Qiao, Yu and Yuan, Fei},
  journal={arXiv preprint arXiv:2407.05975},
  year={2024}
}
```
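### Appendix: Reusable Inference Sketch

For repeated use, the single-example pattern in the usage section can be wrapped in a small helper that builds the prompt and returns only the generated label. This is a minimal illustrative sketch, not part of the official LLaMAX codebase: the `predict_nli` function, its `max_new_tokens` budget, and the prompt-stripping step are assumptions layered on the prompt format shown above.

```python
import torch

# Illustrative helper (not an official LLaMAX API): wraps the
# "Premise: ... Hypothesis: ... Label:" prompt format from the usage example.
def predict_nli(model, tokenizer, premise: str, hypothesis: str) -> str:
    query = f"Premise: {premise} Hypothesis: {hypothesis} Label:"
    inputs = tokenizer(query, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # The label is short, so a handful of new tokens is enough.
        generate_ids = model.generate(inputs.input_ids, max_new_tokens=5)
    decoded = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )[0]
    # Causal LMs echo the prompt in the decoded output; keep only the
    # generated continuation (assumes decoding reproduces the prompt verbatim).
    return decoded[len(query):].strip()
```

Because the backbone is multilingual, the same prompt format can be tried on non-English pairs, e.g. `predict_nli(model, tokenizer, "Elle ne comprend pas vraiment.", "En fait, elle ne comprend pas.")` for French.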