
Model details

This machine translation model translates single sentences between any pair of the following languages:

ISO 639-3 Language name
eng English
ach Acholi
lgg Lugbara
lug Luganda
nyn Runyankole
teo Ateso

It was trained on the SALT dataset and a variety of additional external data resources, including back-translated news articles, FLORES-200, MT560 and LAFAND-MT. The base model was facebook/nllb-200-1.3B, with tokens adapted to add support for languages not originally included.

Usage example

import torch
import transformers

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = transformers.NllbTokenizer.from_pretrained(
    'Sunbird/translate-nllb-1.3b-salt')
model = transformers.M2M100ForConditionalGeneration.from_pretrained(
    'Sunbird/translate-nllb-1.3b-salt')

text = 'Where is the hospital?'
source_language = 'eng'
target_language = 'lug'

# Token ID of each supported language in this model's tokenizer.
language_tokens = {
    'eng': 256047,
    'ach': 256111,
    'lgg': 256008,
    'lug': 256110,
    'nyn': 256002,
    'teo': 256006,
}

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inputs = tokenizer(text, return_tensors="pt").to(device)
# Overwrite the leading language token with the source-language ID.
inputs['input_ids'][0][0] = language_tokens[source_language]
translated_tokens = model.to(device).generate(
    **inputs,
    # Force decoding to start with the target-language token.
    forced_bos_token_id=language_tokens[target_language],
    max_length=100,
    num_beams=5,
)

result = tokenizer.batch_decode(
    translated_tokens, skip_special_tokens=True)[0]
# Eddwaliro liri ludda wa?
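
The usage example looks up language tokens directly in a dictionary, so a mistyped or unsupported code surfaces only as a bare KeyError. A minimal sketch of a validation step is shown below. The helper name `resolve_language_pair` and the constant `LANGUAGE_TOKENS` are hypothetical and not part of the model or of transformers; the token IDs are copied from the example above.

```python
# Language-token IDs copied from the usage example above.
LANGUAGE_TOKENS = {
    'eng': 256047,
    'ach': 256111,
    'lgg': 256008,
    'lug': 256110,
    'nyn': 256002,
    'teo': 256006,
}


def resolve_language_pair(source_language, target_language):
    """Hypothetical helper: return (source_token_id, target_token_id).

    Raises ValueError with the list of supported codes if either
    language code is not one of the six listed in this model card.
    """
    for code in (source_language, target_language):
        if code not in LANGUAGE_TOKENS:
            supported = ', '.join(sorted(LANGUAGE_TOKENS))
            raise ValueError(
                f'Unsupported language code {code!r}; '
                f'expected one of: {supported}')
    return LANGUAGE_TOKENS[source_language], LANGUAGE_TOKENS[target_language]


source_id, target_id = resolve_language_pair('eng', 'lug')
```

In the usage example, `source_id` would take the place of `language_tokens[source_language]` when overwriting the first input token, and `target_id` would be passed as `forced_bos_token_id`.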

Evaluation metrics

Results on salt-dev:

Source language Target language BLEU
ach eng 28.371
lgg eng 30.45
lug eng 41.978
nyn eng 32.296
teo eng 30.422
eng ach 20.972
eng lgg 22.362
eng lug 30.359
eng nyn 15.305
eng teo 21.391
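
To compare the two translation directions at a glance, the scores above can be averaged per direction. The sketch below uses an unweighted mean over the five language pairs in each direction. This summary is an illustrative choice, not a metric reported for the model, and it ignores any differences in test-set size between languages.

```python
from statistics import mean

# BLEU scores copied from the salt-dev table above, keyed by (source, target).
BLEU = {
    ('ach', 'eng'): 28.371,
    ('lgg', 'eng'): 30.45,
    ('lug', 'eng'): 41.978,
    ('nyn', 'eng'): 32.296,
    ('teo', 'eng'): 30.422,
    ('eng', 'ach'): 20.972,
    ('eng', 'lgg'): 22.362,
    ('eng', 'lug'): 30.359,
    ('eng', 'nyn'): 15.305,
    ('eng', 'teo'): 21.391,
}

# Split the table by direction: translating into English vs. out of English.
into_english = [score for (src, tgt), score in BLEU.items() if tgt == 'eng']
from_english = [score for (src, tgt), score in BLEU.items() if src == 'eng']

mean_into_english = mean(into_english)
mean_from_english = mean(from_english)

print(f'xx -> eng mean BLEU: {mean_into_english:.2f}')
print(f'eng -> xx mean BLEU: {mean_from_english:.2f}')
```

On these numbers, translation into English averages roughly 32.7 BLEU and translation out of English roughly 22.1.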