meta-nllb-600m-mt-en-twi-v4

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set after the final epoch (epoch 20, step 9600):

  • Loss: 0.5793
  • Rouge1: 0.6092
  • Bleu: 22.4778
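
Since the card gives no usage instructions, the following is a minimal sketch of how a fine-tuned NLLB checkpoint like this one could be loaded for English-to-Twi translation with Transformers. The repository id is taken from this page; the language codes `eng_Latn` and `twi_Latn` are the standard NLLB-200 codes and are an assumption here, because the card does not state which codes were used during fine-tuning. The helper name `translate` is hypothetical.

```python
MODEL_ID = "Monsia/meta-nllb-600m-mt-en-twi-v4"
SRC_LANG = "eng_Latn"  # assumption: standard NLLB-200 code for English
TGT_LANG = "twi_Latn"  # assumption: standard NLLB-200 code for Twi


def translate(text: str, max_new_tokens: int = 128) -> str:
    """Translate one English sentence to Twi (hypothetical helper)."""
    # Imported lazily so that defining the helper does not load the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt")
    # NLLB selects the output language by forcing the first generated token
    # to be the target language code.
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `translate("Good morning, how are you?")` downloads the weights on first use. For repeated calls it would be more efficient to load the tokenizer and model once and reuse them.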

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
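
As a rough illustration, the hyperparameters above can be written as keyword arguments for `Seq2SeqTrainingArguments`, and a few quantities can be derived from them. This is a sketch under stated assumptions, not the author's training script: it assumes no warmup, a single device, and no gradient accumulation (the card lists none of these), and it maps "Native AMP" to `fp16=True`, although `bf16` would also be possible. The step counts come from the training results table (9600 steps over 20 epochs).

```python
import math

# Hyperparameters from the list above, using Seq2SeqTrainingArguments names.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
    "fp16": True,  # assumption: "Native AMP" could also mean bf16
}

TOTAL_STEPS = 9600  # final step in the training results table
STEPS_PER_EPOCH = TOTAL_STEPS // TRAINING_KWARGS["num_train_epochs"]


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under a linear schedule (assumes zero warmup by default)."""
    base = TRAINING_KWARGS["learning_rate"]
    if step < warmup_steps:
        return base * step / max(1, warmup_steps)
    remaining = TOTAL_STEPS - step
    return base * max(0.0, remaining / max(1, TOTAL_STEPS - warmup_steps))


# With batch size 8 on one device and no gradient accumulation (assumption),
# 480 steps per epoch bounds the training set size.
batch = TRAINING_KWARGS["per_device_train_batch_size"]
max_examples = STEPS_PER_EPOCH * batch
min_examples = (STEPS_PER_EPOCH - 1) * batch + 1
assert math.ceil(max_examples / batch) == STEPS_PER_EPOCH

print(STEPS_PER_EPOCH, min_examples, max_examples, linear_lr(4800))
```

Under these assumptions the training set would hold between 3833 and 3840 examples, and the learning rate would fall to half its initial value at step 4800 and reach zero at step 9600.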

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Bleu    |
|---------------|-------|------|-----------------|--------|---------|
| No log        | 1.0   | 480  | 4.1464          | 0.5420 | 15.3433 |
| 5.9598        | 2.0   | 960  | 1.8878          | 0.5603 | 16.8733 |
| 2.993         | 3.0   | 1440 | 0.7451          | 0.5753 | 18.6067 |
| 1.3619        | 4.0   | 1920 | 0.6291          | 0.5880 | 19.9709 |
| 0.888         | 5.0   | 2400 | 0.6059          | 0.5953 | 20.4567 |
| 0.7774        | 6.0   | 2880 | 0.5961          | 0.6000 | 21.1082 |
| 0.7358        | 7.0   | 3360 | 0.5907          | 0.6049 | 21.4798 |
| 0.6934        | 8.0   | 3840 | 0.5866          | 0.6068 | 21.6956 |
| 0.666         | 9.0   | 4320 | 0.5816          | 0.6058 | 21.8561 |
| 0.6533        | 10.0  | 4800 | 0.5799          | 0.6063 | 21.8737 |
| 0.6266        | 11.0  | 5280 | 0.5791          | 0.6078 | 22.1400 |
| 0.6063        | 12.0  | 5760 | 0.5792          | 0.6106 | 22.3387 |
| 0.6058        | 13.0  | 6240 | 0.5790          | 0.6072 | 22.2070 |
| 0.5786        | 14.0  | 6720 | 0.5777          | 0.6084 | 22.2723 |
| 0.5754        | 15.0  | 7200 | 0.5800          | 0.6079 | 22.2117 |
| 0.5707        | 16.0  | 7680 | 0.5784          | 0.6084 | 22.2791 |
| 0.557         | 17.0  | 8160 | 0.5790          | 0.6081 | 22.4436 |
| 0.5531        | 18.0  | 8640 | 0.5796          | 0.6097 | 22.5290 |
| 0.5464        | 19.0  | 9120 | 0.5797          | 0.6085 | 22.3927 |
| 0.5499        | 20.0  | 9600 | 0.5793          | 0.6092 | 22.4778 |
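
The headline metrics match the epoch 20 row, but other epochs score slightly better on individual metrics. The short sketch below, with values copied from the training results above, finds the best epoch per metric. The card does not say which checkpoint was kept or whether `load_best_model_at_end` was used, so this is only a way to inspect the log, not a statement about the released weights.

```python
# (epoch, validation_loss, rouge1, bleu) copied from the training results.
RESULTS = [
    (1, 4.1464, 0.5420, 15.3433),
    (2, 1.8878, 0.5603, 16.8733),
    (3, 0.7451, 0.5753, 18.6067),
    (4, 0.6291, 0.5880, 19.9709),
    (5, 0.6059, 0.5953, 20.4567),
    (6, 0.5961, 0.6000, 21.1082),
    (7, 0.5907, 0.6049, 21.4798),
    (8, 0.5866, 0.6068, 21.6956),
    (9, 0.5816, 0.6058, 21.8561),
    (10, 0.5799, 0.6063, 21.8737),
    (11, 0.5791, 0.6078, 22.1400),
    (12, 0.5792, 0.6106, 22.3387),
    (13, 0.5790, 0.6072, 22.2070),
    (14, 0.5777, 0.6084, 22.2723),
    (15, 0.5800, 0.6079, 22.2117),
    (16, 0.5784, 0.6084, 22.2791),
    (17, 0.5790, 0.6081, 22.4436),
    (18, 0.5796, 0.6097, 22.5290),
    (19, 0.5797, 0.6085, 22.3927),
    (20, 0.5793, 0.6092, 22.4778),
]

best_loss = min(RESULTS, key=lambda row: row[1])   # lowest validation loss
best_rouge = max(RESULTS, key=lambda row: row[2])  # highest Rouge1
best_bleu = max(RESULTS, key=lambda row: row[3])   # highest Bleu
final = RESULTS[-1]

print("best validation loss:", best_loss)
print("best Rouge1:", best_rouge)
print("best Bleu:", best_bleu)
print("final epoch:", final)
```

Validation loss is lowest at epoch 14 (0.5777), Rouge1 peaks at epoch 12 (0.6106), and Bleu peaks at epoch 18 (22.5290). The gaps to the final epoch are small (about 0.05 Bleu and 0.002 validation loss), which suggests training had largely plateaued after roughly epoch 11.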

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.3.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1

Model size: 615M params (Safetensors, tensor type F32)

Model tree for Monsia/meta-nllb-600m-mt-en-twi-v4: this model is a fine-tune of facebook/nllb-200-distilled-600M.