image-captioning-Vit-GPT2-Flickr8k
This model is a fine-tuned version of nlpconnect/vit-gpt2-image-captioning on the Flickr8k dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the metrics):
- Loss: 0.4624
- ROUGE-1: 38.4609
- ROUGE-2: 14.1268
- ROUGE-L: 35.4304
- ROUGE-Lsum: 35.391
- Gen Len: 12.1355
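For reference, a minimal inference sketch, assuming the fine-tuned checkpoint is published on the Hugging Face Hub (the repo id below is a placeholder, not confirmed by this card):

```python
# Minimal inference sketch; the repo id is a placeholder for wherever this
# checkpoint is actually hosted.
import torch
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

model_id = "your-username/image-captioning-Vit-GPT2-Flickr8k"  # hypothetical path
model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

image = Image.open("example.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)

# max_length of 16 roughly matches the ~12-token Gen Len on the eval set.
output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```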
Model description
A vision encoder-decoder model for image captioning: a ViT image encoder paired with a GPT-2 text decoder, fine-tuned from the nlpconnect/vit-gpt2-image-captioning checkpoint to generate short English captions.
Intended uses & limitations
Intended for generating short English captions for natural images. Flickr8k is a small dataset of everyday scenes, so captions for out-of-distribution inputs (e.g. documents, medical imagery, fine-grained categories) should be treated with caution.
Training and evaluation data
Fine-tuned and evaluated on Flickr8k, which contains roughly 8,000 images, each paired with five human-written captions.
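The card does not document the preprocessing pipeline; below is a minimal sketch, assuming image paths and caption strings from Flickr8k, of how pairs are typically converted into inputs for this encoder-decoder:

```python
# Hypothetical preprocessing sketch: turn one (image, caption) pair into
# model inputs, using the base checkpoint's processor and tokenizer.
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor

base = "nlpconnect/vit-gpt2-image-captioning"
processor = ViTImageProcessor.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

def preprocess(image_path, caption, max_target_length=32):
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    labels = tokenizer(
        caption,
        padding="max_length",
        truncation=True,
        max_length=max_target_length,
        return_tensors="pt",
    ).input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    return {"pixel_values": pixel_values.squeeze(0), "labels": labels.squeeze(0)}
```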
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list for how they map onto training arguments):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
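A sketch of how the values above map onto `Seq2SeqTrainingArguments`; the output directory and the 500-step evaluation cadence (inferred from the results table below) are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="vit-gpt2-flickr8k",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    num_train_epochs=3.0,
    lr_scheduler_type="linear",
    evaluation_strategy="steps",     # the table reports metrics every 500 steps
    eval_steps=500,
    predict_with_generate=True,      # required to compute ROUGE during evaluation
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the optimizer default.
```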
Training results
| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.5495 | 0.06 | 500 | 0.4942 | 35.0812 | 11.7357 | 32.4228 | 32.4251 | 11.5738 |
| 0.4945 | 0.12 | 1000 | 0.4903 | 35.4943 | 12.0207 | 32.8571 | 32.8486 | 11.8682 |
| 0.4984 | 0.19 | 1500 | 0.4862 | 35.3652 | 11.9707 | 32.8296 | 32.8126 | 12.0544 |
| 0.4783 | 0.25 | 2000 | 0.4808 | 36.1048 | 12.3597 | 33.4635 | 33.4504 | 11.3468 |
| 0.4736 | 0.31 | 2500 | 0.4772 | 35.9342 | 12.343 | 33.519 | 33.495 | 11.1066 |
| 0.4685 | 0.37 | 3000 | 0.4708 | 36.8985 | 13.0743 | 34.3294 | 34.2978 | 11.4739 |
| 0.4687 | 0.43 | 3500 | 0.4704 | 36.1934 | 12.5721 | 33.4731 | 33.4671 | 11.9201 |
| 0.4709 | 0.49 | 4000 | 0.4696 | 36.1822 | 12.8306 | 33.4001 | 33.3673 | 12.1733 |
| 0.4575 | 0.56 | 4500 | 0.4675 | 37.4471 | 13.7553 | 34.5655 | 34.5384 | 12.6302 |
| 0.4484 | 0.62 | 5000 | 0.4662 | 36.6786 | 13.0601 | 33.9348 | 33.8999 | 12.6007 |
| 0.4507 | 0.68 | 5500 | 0.4656 | 36.506 | 12.7992 | 34.0665 | 34.0409 | 11.4316 |
| 0.4445 | 0.74 | 6000 | 0.4628 | 37.0737 | 13.3324 | 34.416 | 34.3902 | 12.3211 |
| 0.4557 | 0.8 | 6500 | 0.4594 | 37.3349 | 13.1633 | 34.4709 | 34.4503 | 12.2522 |
| 0.4451 | 0.87 | 7000 | 0.4600 | 37.3384 | 13.5699 | 34.6726 | 34.6555 | 12.0494 |
| 0.4381 | 0.93 | 7500 | 0.4588 | 37.6164 | 13.7855 | 34.8467 | 34.8084 | 12.1347 |
| 0.4357 | 0.99 | 8000 | 0.4571 | 37.2047 | 13.4341 | 34.3383 | 34.3121 | 12.2670 |
| 0.3869 | 1.05 | 8500 | 0.4612 | 37.684 | 13.6922 | 34.9914 | 34.9721 | 11.3216 |
| 0.377 | 1.11 | 9000 | 0.4616 | 37.2615 | 13.2059 | 34.3375 | 34.3327 | 12.3221 |
| 0.3736 | 1.17 | 9500 | 0.4607 | 37.2109 | 13.1387 | 34.3923 | 34.3638 | 11.8274 |
| 0.3801 | 1.24 | 10000 | 0.4617 | 38.0033 | 13.7561 | 35.2434 | 35.2414 | 11.6079 |
| 0.3816 | 1.3 | 10500 | 0.4599 | 37.3453 | 13.622 | 34.6495 | 34.639 | 12.2101 |
| 0.377 | 1.36 | 11000 | 0.4619 | 37.2996 | 13.4583 | 34.3777 | 34.3525 | 12.3911 |
| 0.3745 | 1.42 | 11500 | 0.4604 | 37.5448 | 13.3841 | 34.5785 | 34.5532 | 12.2747 |
| 0.3785 | 1.48 | 12000 | 0.4568 | 38.0769 | 14.0089 | 35.0744 | 35.0605 | 12.3179 |
| 0.3675 | 1.54 | 12500 | 0.4587 | 37.6284 | 13.8277 | 34.7837 | 34.7618 | 11.8732 |
| 0.3731 | 1.61 | 13000 | 0.4554 | 38.433 | 14.1461 | 35.6757 | 35.6683 | 11.4294 |
| 0.3731 | 1.67 | 13500 | 0.4548 | 37.9065 | 13.7526 | 34.9091 | 34.8919 | 12.1241 |
| 0.371 | 1.73 | 14000 | 0.4542 | 38.4064 | 14.2136 | 35.4845 | 35.4671 | 12.1014 |
| 0.3615 | 1.79 | 14500 | 0.4551 | 38.0695 | 14.1042 | 35.162 | 35.1427 | 12.1135 |
| 0.3687 | 1.85 | 15000 | 0.4550 | 38.1978 | 14.1243 | 35.3107 | 35.2821 | 12.2255 |
| 0.3711 | 1.92 | 15500 | 0.4532 | 37.661 | 13.603 | 34.7601 | 34.7467 | 12.1632 |
| 0.3685 | 1.98 | 16000 | 0.4515 | 38.5727 | 14.5345 | 35.5855 | 35.5585 | 11.9162 |
| 0.3333 | 2.04 | 16500 | 0.4626 | 38.4657 | 14.4726 | 35.6431 | 35.6119 | 11.9506 |
| 0.3129 | 2.1 | 17000 | 0.4660 | 38.2002 | 14.0689 | 35.1851 | 35.1748 | 12.3313 |
| 0.3155 | 2.16 | 17500 | 0.4674 | 37.8919 | 13.91 | 34.9167 | 34.9154 | 12.4853 |
| 0.3134 | 2.22 | 18000 | 0.4644 | 38.1576 | 13.9371 | 35.0486 | 35.0252 | 11.9748 |
| 0.3167 | 2.29 | 18500 | 0.4653 | 37.8516 | 13.9029 | 34.7959 | 34.7847 | 12.5273 |
| 0.322 | 2.35 | 19000 | 0.4673 | 37.9883 | 14.0127 | 34.8667 | 34.841 | 12.4680 |
| 0.312 | 2.41 | 19500 | 0.4641 | 38.4611 | 14.238 | 35.4465 | 35.417 | 11.9315 |
| 0.3173 | 2.47 | 20000 | 0.4654 | 38.1477 | 13.9164 | 35.1148 | 35.0905 | 12.4845 |
| 0.3081 | 2.53 | 20500 | 0.4640 | 38.7153 | 14.3282 | 35.7048 | 35.6923 | 11.8932 |
| 0.3093 | 2.6 | 21000 | 0.4633 | 38.2932 | 14.0961 | 35.2736 | 35.2308 | 11.8932 |
| 0.3154 | 2.66 | 21500 | 0.4637 | 38.0708 | 13.7374 | 35.0722 | 35.055 | 12.1310 |
| 0.3096 | 2.72 | 22000 | 0.4630 | 38.3722 | 14.041 | 35.2847 | 35.2425 | 12.2591 |
| 0.3101 | 2.78 | 22500 | 0.4627 | 38.6372 | 14.2961 | 35.5118 | 35.4819 | 12.2836 |
| 0.309 | 2.84 | 23000 | 0.4620 | 38.3596 | 14.0396 | 35.3285 | 35.3 | 12.3281 |
| 0.312 | 2.9 | 23500 | 0.4623 | 38.4268 | 14.0768 | 35.4015 | 35.3656 | 12.2208 |
| 0.3135 | 2.97 | 24000 | 0.4624 | 38.4609 | 14.1268 | 35.4304 | 35.391 | 12.1355 |
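The ROUGE values above are on a 0-100 scale. The exact metric configuration used for this card is not documented; a sketch of computing comparable scores with the `evaluate` library:

```python
import evaluate

rouge = evaluate.load("rouge")

# Decoded model captions vs. ground-truth captions (toy examples).
predictions = ["a dog runs across the grass"]
references = ["a brown dog is running on the grass"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# `evaluate` returns fractions in [0, 1]; the card reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```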
Framework versions
- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2