cnn_dailymail_8824_bart-base
This model is a fine-tuned version of facebook/bart-base. The auto-generated card records the training dataset as unknown, although the model name suggests CNN/DailyMail. It achieves the following results on the evaluation set:
- Loss: 0.9201
- Rouge1: 0.2472
- Rouge2: 0.1256
- Rougel: 0.2063
- Rougelsum: 0.2331
- Gen Len: 20.0
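For readers unfamiliar with the metric, Rouge1 measures unigram overlap between a generated summary and its reference. The sketch below is a simplified illustration only: it assumes lowercased whitespace tokenization and no stemming, and the function name `rouge1_f1` is hypothetical. It is not the evaluation code behind the numbers above, which were presumably produced by a standard ROUGE package, so it will not reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Simplified: lowercased whitespace tokens, no stemming (an assumption).
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a word counts only as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Toy sentences, not taken from the evaluation set.
score = rouge1_f1("the cat sat on the mat", "the cat lay on a mat")
print(f"ROUGE-1 F1: {score:.4f}")
```

Rouge2 applies the same idea to bigrams, and Rougel uses the longest common subsequence instead of fixed-size n-grams.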
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
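The relationship between these values can be sketched in a few lines. The example below assumes a single training device (the card does not say how many were used) and estimates the total number of optimizer steps from the results table, where step 44500 corresponds to epoch 9.92. The helper names `effective_batch_size` and `linear_lr` are hypothetical, and the schedule is the usual linear warmup followed by linear decay to zero, not code taken from the training run.

```python
def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float, warmup_steps: int,
              total_steps: int) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 4 examples per device x 16 accumulation steps = 64, assuming one device.
batch = effective_batch_size(per_device=4, accumulation_steps=16)

# Assumption: about 44,860 optimizer steps over 10 epochs, estimated from
# the results table (44500 steps / 9.92 epochs * 10 epochs).
TOTAL_STEPS = 44_860

for step in (0, 250, 500, 22_680, TOTAL_STEPS):
    lr = linear_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=TOTAL_STEPS)
    print(f"step {step:>6}: lr = {lr:.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 at step 500 and falls to zero at the end of training.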
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.2077 | 0.11 | 500 | 1.0668 | 0.2378 | 0.1128 | 0.1955 | 0.2228 | 20.0 |
1.1503 | 0.22 | 1000 | 1.0418 | 0.2376 | 0.1145 | 0.1964 | 0.223 | 20.0 |
1.1191 | 0.33 | 1500 | 1.0109 | 0.2409 | 0.1187 | 0.1995 | 0.2268 | 20.0 |
1.0828 | 0.45 | 2000 | 1.0048 | 0.2408 | 0.1192 | 0.2004 | 0.227 | 20.0 |
1.0546 | 0.56 | 2500 | 0.9911 | 0.2417 | 0.1206 | 0.2008 | 0.2278 | 20.0 |
1.0537 | 0.67 | 3000 | 0.9891 | 0.2418 | 0.1201 | 0.2014 | 0.2277 | 20.0 |
1.0643 | 0.78 | 3500 | 0.9895 | 0.2396 | 0.1194 | 0.1997 | 0.2259 | 20.0 |
1.0375 | 0.89 | 4000 | 0.9775 | 0.2434 | 0.122 | 0.2025 | 0.2293 | 20.0 |
1.013 | 1.0 | 4500 | 0.9728 | 0.244 | 0.1218 | 0.2029 | 0.2298 | 20.0 |
1.0247 | 1.11 | 5000 | 0.9705 | 0.243 | 0.1206 | 0.2019 | 0.2287 | 20.0 |
1.0374 | 1.23 | 5500 | 0.9642 | 0.2432 | 0.1217 | 0.2022 | 0.2292 | 20.0 |
1.0084 | 1.34 | 6000 | 0.9609 | 0.2437 | 0.1235 | 0.204 | 0.2299 | 20.0 |
1.0195 | 1.45 | 6500 | 0.9603 | 0.243 | 0.1221 | 0.2029 | 0.2291 | 20.0 |
0.9642 | 1.56 | 7000 | 0.9559 | 0.2438 | 0.1228 | 0.2035 | 0.2301 | 20.0 |
0.9903 | 1.67 | 7500 | 0.9540 | 0.243 | 0.1225 | 0.2029 | 0.2293 | 20.0 |
0.976 | 1.78 | 8000 | 0.9518 | 0.2434 | 0.1224 | 0.2025 | 0.2297 | 19.9997 |
1.0101 | 1.89 | 8500 | 0.9460 | 0.2452 | 0.1235 | 0.2042 | 0.231 | 20.0 |
0.9711 | 2.01 | 9000 | 0.9446 | 0.2431 | 0.1226 | 0.2032 | 0.2295 | 19.9995 |
0.9137 | 2.12 | 9500 | 0.9463 | 0.2459 | 0.1239 | 0.205 | 0.2318 | 20.0 |
0.9631 | 2.23 | 10000 | 0.9410 | 0.2451 | 0.1234 | 0.2043 | 0.2309 | 19.9999 |
0.9309 | 2.34 | 10500 | 0.9399 | 0.2446 | 0.1236 | 0.2042 | 0.2308 | 19.9991 |
0.9653 | 2.45 | 11000 | 0.9363 | 0.2444 | 0.1233 | 0.2039 | 0.2308 | 19.9999 |
0.9338 | 2.56 | 11500 | 0.9413 | 0.2439 | 0.1224 | 0.2028 | 0.2294 | 20.0 |
0.9373 | 2.67 | 12000 | 0.9334 | 0.245 | 0.1241 | 0.2047 | 0.2312 | 19.9996 |
0.9661 | 2.79 | 12500 | 0.9334 | 0.2456 | 0.1241 | 0.2051 | 0.2318 | 19.9999 |
0.9446 | 2.9 | 13000 | 0.9340 | 0.2447 | 0.1239 | 0.2045 | 0.2309 | 19.9999 |
0.9109 | 3.01 | 13500 | 0.9340 | 0.2445 | 0.1234 | 0.2041 | 0.2308 | 19.9999 |
0.8955 | 3.12 | 14000 | 0.9357 | 0.2459 | 0.1249 | 0.2055 | 0.2318 | 20.0 |
0.9163 | 3.23 | 14500 | 0.9319 | 0.2461 | 0.1239 | 0.205 | 0.2319 | 20.0 |
0.9059 | 3.34 | 15000 | 0.9320 | 0.2446 | 0.124 | 0.2044 | 0.2309 | 19.9997 |
0.8893 | 3.46 | 15500 | 0.9288 | 0.2462 | 0.1247 | 0.2053 | 0.2322 | 19.9999 |
0.8963 | 3.57 | 16000 | 0.9301 | 0.2441 | 0.124 | 0.2043 | 0.2306 | 20.0 |
0.8924 | 3.68 | 16500 | 0.9295 | 0.2431 | 0.1236 | 0.2038 | 0.2296 | 19.9997 |
0.8832 | 3.79 | 17000 | 0.9267 | 0.2457 | 0.1237 | 0.2049 | 0.2316 | 19.9999 |
0.8874 | 3.9 | 17500 | 0.9263 | 0.2458 | 0.125 | 0.2054 | 0.232 | 20.0 |
0.8464 | 4.01 | 18000 | 0.9272 | 0.2446 | 0.1234 | 0.2039 | 0.2305 | 20.0 |
0.8391 | 4.12 | 18500 | 0.9253 | 0.2453 | 0.1245 | 0.205 | 0.2313 | 20.0 |
0.8602 | 4.24 | 19000 | 0.9273 | 0.2464 | 0.1248 | 0.2055 | 0.2322 | 19.9997 |
0.8674 | 4.35 | 19500 | 0.9260 | 0.2449 | 0.1242 | 0.2047 | 0.2309 | 20.0 |
0.8634 | 4.46 | 20000 | 0.9261 | 0.2462 | 0.1248 | 0.2053 | 0.2322 | 20.0 |
0.8522 | 4.57 | 20500 | 0.9259 | 0.2456 | 0.1242 | 0.2052 | 0.2316 | 20.0 |
0.8532 | 4.68 | 21000 | 0.9256 | 0.2452 | 0.1242 | 0.2049 | 0.2315 | 20.0 |
0.8608 | 4.79 | 21500 | 0.9218 | 0.2446 | 0.1242 | 0.2049 | 0.2309 | 19.9997 |
0.8649 | 4.9 | 22000 | 0.9239 | 0.2461 | 0.1243 | 0.2047 | 0.2317 | 19.9997 |
0.8329 | 5.02 | 22500 | 0.9260 | 0.2456 | 0.1248 | 0.2052 | 0.2315 | 19.9999 |
0.8475 | 5.13 | 23000 | 0.9247 | 0.2449 | 0.1241 | 0.2045 | 0.2309 | 20.0 |
0.8595 | 5.24 | 23500 | 0.9246 | 0.2443 | 0.1239 | 0.2044 | 0.2306 | 20.0 |
0.8707 | 5.35 | 24000 | 0.9228 | 0.2458 | 0.1246 | 0.2054 | 0.2318 | 19.9997 |
0.8565 | 5.46 | 24500 | 0.9243 | 0.245 | 0.1241 | 0.2047 | 0.231 | 20.0 |
0.848 | 5.57 | 25000 | 0.9232 | 0.2464 | 0.1256 | 0.206 | 0.2324 | 20.0 |
0.8251 | 5.68 | 25500 | 0.9212 | 0.2465 | 0.1253 | 0.2057 | 0.2327 | 20.0 |
0.8352 | 5.8 | 26000 | 0.9203 | 0.245 | 0.1242 | 0.2043 | 0.2309 | 19.9996 |
0.837 | 5.91 | 26500 | 0.9178 | 0.2464 | 0.1247 | 0.2055 | 0.2321 | 19.9999 |
0.8233 | 6.02 | 27000 | 0.9204 | 0.2456 | 0.1247 | 0.2052 | 0.2318 | 20.0 |
0.8169 | 6.13 | 27500 | 0.9246 | 0.2454 | 0.1242 | 0.205 | 0.2314 | 20.0 |
0.8351 | 6.24 | 28000 | 0.9194 | 0.2453 | 0.1248 | 0.2052 | 0.2312 | 20.0 |
0.8275 | 6.35 | 28500 | 0.9221 | 0.2468 | 0.1255 | 0.2062 | 0.2329 | 19.9999 |
0.818 | 6.46 | 29000 | 0.9244 | 0.2456 | 0.1243 | 0.205 | 0.2316 | 20.0 |
0.8262 | 6.58 | 29500 | 0.9194 | 0.2471 | 0.1256 | 0.2064 | 0.233 | 20.0 |
0.8138 | 6.69 | 30000 | 0.9225 | 0.2469 | 0.1257 | 0.2062 | 0.233 | 20.0 |
0.8476 | 6.8 | 30500 | 0.9188 | 0.2467 | 0.1254 | 0.2059 | 0.2328 | 20.0 |
0.8376 | 6.91 | 31000 | 0.9216 | 0.2473 | 0.1255 | 0.2064 | 0.2331 | 20.0 |
0.7947 | 7.02 | 31500 | 0.9218 | 0.2471 | 0.1256 | 0.2061 | 0.2329 | 19.9999 |
0.7937 | 7.13 | 32000 | 0.9241 | 0.2465 | 0.1249 | 0.2057 | 0.2324 | 19.9996 |
0.8194 | 7.24 | 32500 | 0.9230 | 0.2471 | 0.1259 | 0.2063 | 0.2329 | 20.0 |
0.8122 | 7.36 | 33000 | 0.9204 | 0.2458 | 0.125 | 0.2055 | 0.232 | 19.9996 |
0.7676 | 7.47 | 33500 | 0.9232 | 0.2468 | 0.1253 | 0.206 | 0.2327 | 20.0 |
0.7772 | 7.58 | 34000 | 0.9226 | 0.2463 | 0.1251 | 0.2057 | 0.2323 | 20.0 |
0.809 | 7.69 | 34500 | 0.9197 | 0.2469 | 0.1255 | 0.2061 | 0.2329 | 19.9997 |
0.7839 | 7.8 | 35000 | 0.9205 | 0.2475 | 0.1261 | 0.2067 | 0.2334 | 19.9997 |
0.7936 | 7.91 | 35500 | 0.9186 | 0.2469 | 0.1254 | 0.2061 | 0.2327 | 19.9997 |
0.8108 | 8.02 | 36000 | 0.9215 | 0.2472 | 0.1253 | 0.206 | 0.2329 | 20.0 |
0.7987 | 8.14 | 36500 | 0.9219 | 0.2473 | 0.1254 | 0.2062 | 0.2331 | 19.9999 |
0.7881 | 8.25 | 37000 | 0.9213 | 0.2474 | 0.1253 | 0.206 | 0.233 | 20.0 |
0.8007 | 8.36 | 37500 | 0.9215 | 0.2474 | 0.1258 | 0.2064 | 0.2332 | 20.0 |
0.7789 | 8.47 | 38000 | 0.9226 | 0.2462 | 0.1252 | 0.2054 | 0.2321 | 20.0 |
0.8155 | 8.58 | 38500 | 0.9182 | 0.2465 | 0.1254 | 0.206 | 0.2325 | 19.9999 |
0.7863 | 8.69 | 39000 | 0.9187 | 0.2465 | 0.1252 | 0.2059 | 0.2323 | 19.9999 |
0.796 | 8.8 | 39500 | 0.9201 | 0.2469 | 0.1254 | 0.206 | 0.2327 | 19.9999 |
0.8003 | 8.92 | 40000 | 0.9197 | 0.2463 | 0.1252 | 0.2057 | 0.2323 | 20.0 |
0.803 | 9.03 | 40500 | 0.9206 | 0.2465 | 0.1253 | 0.2058 | 0.2323 | 19.9997 |
0.79 | 9.14 | 41000 | 0.9221 | 0.2467 | 0.1251 | 0.206 | 0.2326 | 19.9997 |
0.7605 | 9.25 | 41500 | 0.9211 | 0.247 | 0.1254 | 0.2059 | 0.2329 | 20.0 |
0.7543 | 9.36 | 42000 | 0.9214 | 0.2473 | 0.1258 | 0.2065 | 0.2333 | 19.9999 |
0.7959 | 9.47 | 42500 | 0.9203 | 0.2471 | 0.1255 | 0.2061 | 0.2332 | 19.9999 |
0.7826 | 9.58 | 43000 | 0.9205 | 0.2469 | 0.1256 | 0.206 | 0.2329 | 20.0 |
0.7835 | 9.7 | 43500 | 0.9198 | 0.2466 | 0.1252 | 0.2057 | 0.2326 | 20.0 |
0.7809 | 9.81 | 44000 | 0.9205 | 0.2469 | 0.1253 | 0.206 | 0.2328 | 20.0 |
0.7899 | 9.92 | 44500 | 0.9201 | 0.2472 | 0.1256 | 0.2063 | 0.2331 | 20.0 |
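The final row is not the best row on every metric: the lowest validation loss in the table is 0.9178 at step 26500, and the highest Rouge1 is 0.2475 at step 35000. A minimal sketch of that comparison follows, using a handful of rows copied from the table above. The tuple layout and variable names are illustrative, and the card does not state which checkpoint was actually saved.

```python
# (step, validation_loss, rouge1) for selected rows of the results table.
rows = [
    (500, 1.0668, 0.2378),
    (26500, 0.9178, 0.2464),
    (35000, 0.9205, 0.2475),
    (38500, 0.9182, 0.2465),
    (44500, 0.9201, 0.2472),
]

# Lower is better for loss; higher is better for ROUGE.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_rouge1 = max(rows, key=lambda row: row[2])
final = rows[-1]

print(f"lowest validation loss: step {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest Rouge1:         step {best_by_rouge1[0]} ({best_by_rouge1[2]})")
print(f"final evaluation:       step {final[0]} "
      f"(loss {final[1]}, Rouge1 {final[2]})")
```

The differences are small (0.0023 in loss and 0.0003 in Rouge1 relative to the final row), so the choice of checkpoint matters little here.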
Framework versions
- Transformers 4.38.2
- Pytorch 2.0.0+cu117
- Datasets 2.18.0
- Tokenizers 0.15.2
Model tree for baek26/bart-cnndm
Base model
facebook/bart-base