Model description
This model translates text from English to Icelandic. It follows the Transformer architecture described in "Attention Is All You Need" and was trained with fairseq for WMT24.
This is the base version of our model. See also: wmt24-en-is-transformer-base-deep, wmt24-en-is-transformer-big, wmt24-en-is-transformer-big-deep.
model | d_model | d_ff | h | N_enc | N_dec |
---|---|---|---|---|---|
Base | 512 | 2048 | 8 | 6 | 6 |
Base_deep | 512 | 2048 | 8 | 36 | 12 |
Big | 1024 | 4096 | 16 | 6 | 6 |
Big_deep | 1024 | 4096 | 16 | 36 | 12 |
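To make the size differences between the four variants concrete, the following sketch estimates the number of non-embedding parameters from the table's hyperparameters (d_model: hidden size, d_ff: feed-forward size, N_enc and N_dec: encoder and decoder layer counts). It assumes a standard Transformer layer layout with biases and layer norms, and it leaves out the embeddings and the output projection because the vocabulary size is not stated here. The function names are illustrative, and the counts reported by fairseq for the actual checkpoints may differ.

```python
# Hyperparameters copied from the table above: (d_model, d_ff, N_enc, N_dec).
# The number of heads h does not change the parameter count, so it is omitted.
CONFIGS = {
    "Base": (512, 2048, 6, 6),
    "Base_deep": (512, 2048, 36, 12),
    "Big": (1024, 4096, 6, 6),
    "Big_deep": (1024, 4096, 36, 12),
}


def attention_params(d_model):
    # Query, key, value and output projections: four d_model x d_model
    # weight matrices, each with a bias vector.
    return 4 * (d_model * d_model + d_model)


def ffn_params(d_model, d_ff):
    # Two linear layers (d_model -> d_ff -> d_model), each with a bias.
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)


def layer_norm_params(d_model):
    # One scale and one shift vector.
    return 2 * d_model


def non_embedding_params(d_model, d_ff, n_enc, n_dec):
    # Assumption: an encoder layer has self-attention, a feed-forward block
    # and two layer norms; a decoder layer adds cross-attention and a third
    # layer norm.
    enc_layer = (attention_params(d_model) + ffn_params(d_model, d_ff)
                 + 2 * layer_norm_params(d_model))
    dec_layer = (2 * attention_params(d_model) + ffn_params(d_model, d_ff)
                 + 3 * layer_norm_params(d_model))
    return n_enc * enc_layer + n_dec * dec_layer


for name, cfg in CONFIGS.items():
    millions = non_embedding_params(*cfg) / 1e6
    print(f"{name}: ~{millions:.1f}M non-embedding parameters")
```

Under these assumptions, the deep variants grow mainly through the 36 encoder layers, while the big variants grow through the wider d_model and d_ff.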
How to use
from fairseq.models.transformer import TransformerModel
TRANSLATION_MODEL_NAME = 'checkpoint_best.pt'
TRANSLATION_MODEL = TransformerModel.from_pretrained(
    'path/to/model',  # placeholder: local directory containing the model files
    checkpoint_file=TRANSLATION_MODEL_NAME,
    bpe='sentencepiece',
    sentencepiece_model='sentencepiece.bpe.model',
)
src_sentences = ['This is a test sentence.', 'This is another test sentence.']
translated_sentences = TRANSLATION_MODEL.translate(src_sentences)  # one output string per input sentence
print(translated_sentences)
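For longer inputs it can be convenient to pass sentences to the model a fixed number at a time, for example to bound memory use or to report progress. The sketch below is one way to do that. `chunked` and `translate_in_chunks` are hypothetical helpers, not part of fairseq, and the translator is passed in as a plain callable so the example runs without a loaded model.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    # Yield consecutive slices of at most `size` items.
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def translate_in_chunks(
    translate_fn: Callable[[List[str]], List[str]],
    sentences: Sequence[str],
    chunk_size: int = 32,
) -> List[str]:
    # Call `translate_fn` on one chunk at a time and keep the input order.
    translations: List[str] = []
    for chunk in chunked(sentences, chunk_size):
        translations.extend(translate_fn(chunk))
    return translations


if __name__ == "__main__":
    # Stand-in translator for demonstration only; it does not translate.
    def fake_translate(batch: List[str]) -> List[str]:
        return [sentence.upper() for sentence in batch]

    print(translate_in_chunks(fake_translate, ["one", "two", "three"], chunk_size=2))
```

With the model loaded as shown above, the call would be `translate_in_chunks(TRANSLATION_MODEL.translate, src_sentences)`. The default chunk size of 32 is an arbitrary choice.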
Eval results
We evaluated our models on the WMT21 test set. These are the chrF scores for our published models:
model | chrF |
---|---|
Base | 56.8 |
Base_deep | 57.1 |
Big | 57.7 |
Big_deep | 57.7 |
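For readers unfamiliar with the metric, chrF is an F-score over character n-grams shared by the system output and the reference translation. The sketch below is a simplified, dependency-free illustration of that idea. It assumes n-gram orders 1 to 6, beta = 2 (recall weighted more heavily than precision), whitespace removed before n-gram extraction, and per-order F-scores averaged over the corpus. It is not the tool used to produce the scores above, and its output will not necessarily match an official implementation, so use an established evaluation tool for comparable numbers.

```python
from collections import Counter


def char_ngrams(text, n):
    # Count character n-grams after removing all whitespace.
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypotheses, references, max_n=6, beta=2.0):
    # Corpus-level score: n-gram statistics are summed over all sentence
    # pairs, an F-beta score is computed per n-gram order, and the orders
    # with any n-grams are averaged.
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    f_scores = []
    for n in range(1, max_n + 1):
        matched = hyp_total = ref_total = 0
        for hyp, ref in zip(hypotheses, references):
            hyp_counts = char_ngrams(hyp, n)
            ref_counts = char_ngrams(ref, n)
            # Counter intersection keeps the minimum count per n-gram.
            matched += sum((hyp_counts & ref_counts).values())
            hyp_total += sum(hyp_counts.values())
            ref_total += sum(ref_counts.values())
        if hyp_total == 0 or ref_total == 0:
            continue  # texts too short for this order
        precision = matched / hyp_total
        recall = matched / ref_total
        if precision + recall == 0:
            f_scores.append(0.0)
        else:
            f_scores.append((1 + beta ** 2) * precision * recall
                            / (beta ** 2 * precision + recall))
    return 100.0 * sum(f_scores) / len(f_scores) if f_scores else 0.0


if __name__ == "__main__":
    # Made-up strings for illustration; these are not model outputs.
    print(simple_chrf(["a small test"], ["a small test"]))
    print(simple_chrf(["a small test"], ["a short test"]))
```

An output identical to its reference scores 100, and an output that shares no characters with its reference scores 0.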
BibTeX entry and citation info
@inproceedings{jasonarson2024cogsinamachine,
year={2024},
title={Cogs in a Machine, Doing What They’re Meant to Do – The AMI Submission to the WMT24 General Translation Task},
author={Atli Jasonarson and Hinrik Hafsteinsson and Bjarki Ármannsson and Steinþór Steingrímsson},
organization={The Árni Magnússon Institute for Icelandic Studies}
}