---
license: mit
base_model: cointegrated/rut5-base-absum
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flux-dsum
  results: []
---

# flux-dsum

This model is a fine-tuned version of [cointegrated/rut5-base-absum](https://huggingface.co/cointegrated/rut5-base-absum) on an unspecified dataset (the training dataset was not recorded).
It achieves the following results on the evaluation set:

- Loss: 1.3535
- Rouge1: 0.3631
- Rouge2: 0.1695
- Rougel: 0.325
- Rougelsum: 0.3251
- Gen Len: 18.2008

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 1.7402        | 1.0   | 21753 | 1.4456          | 0.3492 | 0.1601 | 0.3112 | 0.3114    | 18.0104 |
| 1.59          | 2.0   | 43506 | 1.3912          | 0.3569 | 0.1616 | 0.3186 | 0.3187    | 18.1955 |
| 1.5522        | 3.0   | 65259 | 1.3675          | 0.3607 | 0.1682 | 0.3231 | 0.3233    | 18.1123 |
| 1.5162        | 4.0   | 87012 | 1.3535          | 0.3631 | 0.1695 | 0.325  | 0.3251    | 18.2008 |

### Framework versions

- Transformers 4.36.0.dev0
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1
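
The hyperparameters and step counts reported above imply a learning-rate trajectory and a rough training-set size, which the following minimal sketch works out in plain Python. It is not the training code for this model. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states; the function names `linear_lr` and `examples_per_epoch` are hypothetical helpers introduced only for illustration.

```python
# Values copied from the hyperparameter list and results table of this card.
BASE_LR = 2e-5
NUM_EPOCHS = 4
STEPS_PER_EPOCH = 21753
TOTAL_STEPS = 87012
TRAIN_BATCH_SIZE = 2

# Assumption: the card lists no warmup, so zero warmup steps are used here.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def examples_per_epoch(steps_per_epoch=STEPS_PER_EPOCH,
                       batch_size=TRAIN_BATCH_SIZE,
                       num_devices=1, grad_accum=1):
    """Upper bound on examples seen per epoch (the last batch may be partial).

    Assumption: one device and no gradient accumulation unless overridden.
    """
    return steps_per_epoch * batch_size * num_devices * grad_accum


if __name__ == "__main__":
    # Learning rate at the start of training and at the end of each epoch.
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step:>6}, lr {linear_lr(step):.3e}")
    print("examples per epoch (upper bound):", examples_per_epoch())
```

Under these assumptions the learning rate falls from 2e-05 at step 0 to zero at step 87012, and each epoch covers at most 21753 × 2 = 43506 training examples. If the run used warmup, several devices, or gradient accumulation, the corresponding arguments would need to be changed.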