versae committed on
Commit
2528d68
1 Parent(s): e2ef14f

Step... (9000/50000 | Loss: 1.7518370151519775, Acc: 0.6520029902458191): 18%|█████▎ | 9175/50000 [3:31:42<14:26:35, 1.27s/it]

Files changed (32)
  1. flax_model.msgpack +1 -1
  2. outputs/checkpoints/checkpoint-2000/training_state.json +0 -1
  3. outputs/checkpoints/checkpoint-3000/training_state.json +0 -1
  4. outputs/checkpoints/checkpoint-4000/training_state.json +0 -1
  5. outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/config.json +0 -0
  6. outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/data_collator.joblib +0 -0
  7. outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/flax_model.msgpack +1 -1
  8. outputs/checkpoints/{checkpoint-4000 → checkpoint-7000}/optimizer_state.msgpack +1 -1
  9. outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/training_args.joblib +0 -0
  10. outputs/checkpoints/checkpoint-7000/training_state.json +1 -0
  11. outputs/checkpoints/{checkpoint-3000 → checkpoint-8000}/config.json +0 -0
  12. outputs/checkpoints/{checkpoint-3000 → checkpoint-8000}/data_collator.joblib +0 -0
  13. outputs/checkpoints/{checkpoint-4000 → checkpoint-8000}/flax_model.msgpack +1 -1
  14. outputs/checkpoints/{checkpoint-2000 → checkpoint-8000}/optimizer_state.msgpack +1 -1
  15. outputs/checkpoints/{checkpoint-3000 → checkpoint-8000}/training_args.joblib +0 -0
  16. outputs/checkpoints/checkpoint-8000/training_state.json +1 -0
  17. outputs/checkpoints/{checkpoint-4000 → checkpoint-9000}/config.json +0 -0
  18. outputs/checkpoints/{checkpoint-4000 → checkpoint-9000}/data_collator.joblib +0 -0
  19. outputs/checkpoints/{checkpoint-3000 → checkpoint-9000}/flax_model.msgpack +1 -1
  20. outputs/checkpoints/{checkpoint-3000 → checkpoint-9000}/optimizer_state.msgpack +1 -1
  21. outputs/checkpoints/{checkpoint-4000 → checkpoint-9000}/training_args.joblib +0 -0
  22. outputs/checkpoints/checkpoint-9000/training_state.json +1 -0
  23. outputs/events.out.tfevents.1627258355.tablespoon.3000110.3.v2 +2 -2
  24. outputs/flax_model.msgpack +1 -1
  25. outputs/optimizer_state.msgpack +1 -1
  26. outputs/training_state.json +1 -1
  27. pytorch_model.bin +1 -1
  28. run_stream.512.log +0 -0
  29. wandb/run-20210726_001233-17u6inbn/files/output.log +1727 -0
  30. wandb/run-20210726_001233-17u6inbn/files/wandb-summary.json +1 -1
  31. wandb/run-20210726_001233-17u6inbn/logs/debug-internal.log +0 -0
  32. wandb/run-20210726_001233-17u6inbn/run-17u6inbn.wandb +0 -0
flax_model.msgpack CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b54768633b7c94c65dc0adcd52791094c373de64dec9c2313b790651064030f8
+ oid sha256:55484b434d505ef7284a42471c8326f9bebe13561d6cbe478c61990f9fd7a04d
  size 249750019
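
The model weights are tracked with Git LFS, so the diff above only shows the pointer file changing: a new sha256 oid at an unchanged size of 249750019 bytes. Below is a minimal Python sketch (paths are illustrative, not part of this repository) of checking a downloaded object against such a pointer:

```python
import hashlib

def read_pointer(pointer_path):
    """Parse a git-lfs pointer file into its sha256 oid and byte size."""
    fields = {}
    with open(pointer_path) as f:
        for line in f:
            key, _, value = line.strip().partition(" ")
            fields[key] = value
    oid = fields["oid"].split(":", 1)[1]  # drop the "sha256:" prefix
    return oid, int(fields["size"])

def matches_pointer(object_path, pointer_path):
    """True if the local object hashes to the pointer's oid and size."""
    oid, size = read_pointer(pointer_path)
    digest, seen = hashlib.sha256(), 0
    with open(object_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
            seen += len(chunk)
    return digest.hexdigest() == oid and seen == size
```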
outputs/checkpoints/checkpoint-2000/training_state.json DELETED
@@ -1 +0,0 @@
- {"step": 2001}
outputs/checkpoints/checkpoint-3000/training_state.json DELETED
@@ -1 +0,0 @@
- {"step": 3001}
outputs/checkpoints/checkpoint-4000/training_state.json DELETED
@@ -1 +0,0 @@
- {"step": 4001}
outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/config.json RENAMED
File without changes
outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/data_collator.joblib RENAMED
File without changes
outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/flax_model.msgpack RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0c1f1112c5a4f38297c063f72b4595e44da861d8a37002bfef8f6a7b8a2db074
+ oid sha256:353e62a7bbf3b5817b869c37e749c8e30fe14477d32a3cf95345a030057ed760
  size 249750019
outputs/checkpoints/{checkpoint-4000 → checkpoint-7000}/optimizer_state.msgpack RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9e7ff0681be4448d2a38ba748fadb3bb0e2972be224603c18f86f5c8d4f003cd
+ oid sha256:0cd67c6ccf30e42fa238a68d1aa1ae063e8e11fc6c50bf034163444ab3f91118
  size 499500278
outputs/checkpoints/{checkpoint-2000 → checkpoint-7000}/training_args.joblib RENAMED
File without changes
outputs/checkpoints/checkpoint-7000/training_state.json ADDED
@@ -0,0 +1 @@
+ {"step": 7001}
outputs/checkpoints/{checkpoint-3000 → checkpoint-8000}/config.json RENAMED
File without changes
outputs/checkpoints/{checkpoint-3000 → checkpoint-8000}/data_collator.joblib RENAMED
File without changes
outputs/checkpoints/{checkpoint-4000 → checkpoint-8000}/flax_model.msgpack RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4feb05268279af0817ec9cdbce949fd0a1aedac0c5d72c727a4bddd1667016a7
+ oid sha256:fcd1e001a114c411bab4cde0ffdf4e4bc13e918b2c1c3cac7a75100e5a3f0349
  size 249750019
outputs/checkpoints/{checkpoint-2000 → checkpoint-8000}/optimizer_state.msgpack RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5336f60347962af6217ec50bb816aa2b7a3225125f16bc4da276394b3b7eac9e
+ oid sha256:7070a9b0eb3c596cc8b7f538faa458611e2d751b69600e272ec31b7c5c1bbc82
  size 499500278
outputs/checkpoints/{checkpoint-3000 → checkpoint-8000}/training_args.joblib RENAMED
File without changes
outputs/checkpoints/checkpoint-8000/training_state.json ADDED
@@ -0,0 +1 @@
+ {"step": 8001}
outputs/checkpoints/{checkpoint-4000 → checkpoint-9000}/config.json RENAMED
File without changes
outputs/checkpoints/{checkpoint-4000 → checkpoint-9000}/data_collator.joblib RENAMED
File without changes
outputs/checkpoints/{checkpoint-3000 → checkpoint-9000}/flax_model.msgpack RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e6c49ca76d9eb4fcaedf18790304282801372115ea7b8926caed70d3f347c9eb
+ oid sha256:55484b434d505ef7284a42471c8326f9bebe13561d6cbe478c61990f9fd7a04d
  size 249750019
outputs/checkpoints/{checkpoint-3000 → checkpoint-9000}/optimizer_state.msgpack RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c4b4dd23e9ef7da6242bfe8be61c752f257574633b2f31de4d3a3612ea946453
+ oid sha256:2085e2cdeca180d85963536b92e396dad244a1a40804023af28d868e886658c8
  size 499500278
outputs/checkpoints/{checkpoint-4000 → checkpoint-9000}/training_args.joblib RENAMED
File without changes
outputs/checkpoints/checkpoint-9000/training_state.json ADDED
@@ -0,0 +1 @@
+ {"step": 9001}
outputs/events.out.tfevents.1627258355.tablespoon.3000110.3.v2 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f97666571180797f8a968d48c3763a8cbeb077e610e79e3950db38d6c6a96a2f
- size 957164
+ oid sha256:59ffcc0c842038889bc43ba9ce06f442be97edfc66757b41c7b3292ee06bd1b0
+ size 1325429
outputs/flax_model.msgpack CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b54768633b7c94c65dc0adcd52791094c373de64dec9c2313b790651064030f8
+ oid sha256:55484b434d505ef7284a42471c8326f9bebe13561d6cbe478c61990f9fd7a04d
  size 249750019
outputs/optimizer_state.msgpack CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:db32f6c60858ce8df9fec4b0aab38feff295deb2af84540eced3d8fc2c9e95bc
+ oid sha256:2085e2cdeca180d85963536b92e396dad244a1a40804023af28d868e886658c8
  size 499500278
outputs/training_state.json CHANGED
@@ -1 +1 @@
- {"step": 6001}
+ {"step": 9001}
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:576a3a8b19cb7a56e505fbb15f9773d81f46cc98eef8577312b45d8af16f6155
+ oid sha256:5319ca4762633df47ea8467b2f87c5c43f499ace80a7f0bc7dd075f31d1405fd
  size 498858859
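
pytorch_model.bin is updated in lockstep with flax_model.msgpack, and the log below reports "All Flax model weights were used when initializing RobertaForMaskedLM", which is the message transformers emits when loading with from_flax=True. A plausible sketch of that conversion step; the exact call site and paths in the training script are an assumption:

```python
from transformers import RobertaForMaskedLM

# Load the freshly saved Flax weights into a PyTorch model, then
# re-save them; save_pretrained() writes pytorch_model.bin.
pt_model = RobertaForMaskedLM.from_pretrained("outputs", from_flax=True)
pt_model.save_pretrained("outputs")
```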
run_stream.512.log CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/run-20210726_001233-17u6inbn/files/output.log CHANGED
@@ -4268,6 +4268,1733 @@ You should probably TRAIN this model on a down-stream task to be able to use it
[Most of the 1,733 added lines are blank progress-bar redraws; only the substantive additions are reproduced here.]
+ Step... (6000/50000 | Loss: 1.7780379056930542, Acc: 0.6486639976501465): 14%|████ | 7000/50000 [2:40:57<15:48:42, 1.32s/it]
+ Evaluating ...: 0%| | 0/130 [00:00<?, ?it/s]
+ Step... (6500 | Loss: 1.835520625114441, Learning Rate: 0.0005272727576084435)
+ [04:49:21] - INFO - __main__ - Saving checkpoint at 7000 steps██████████████████████████████████████████████████████| 130/130 [00:21<00:00, 4.59it/s]
+ All Flax model weights were used when initializing RobertaForMaskedLM.
+ Some weights of RobertaForMaskedLM were not initialized from the Flax model and are newly initialized: ['lm_head.decoder.weight', 'roberta.embeddings.position_ids', 'lm_head.decoder.bias']
+ You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+ Step... (7000/50000 | Loss: 1.767648458480835, Acc: 0.6495990753173828): 16%|████▊ | 8000/50000 [3:04:03<15:18:38, 1.31s/it]
+ Step... (7500 | Loss: 1.8483006954193115, Learning Rate: 0.0005151515360921621)
+ Step... (7000/50000 | Loss: 1.767648458480835, Acc: 0.6495990753173828): 16%|████▊ | 8000/50000 [3:04:04<15:18:38, 1.31s/it]
+ [05:12:28] - INFO - __main__ - Saving checkpoint at 8000 steps██████████████████████████████████████████████████████| 130/130 [00:21<00:00, 4.59it/s]
+ All Flax model weights were used when initializing RobertaForMaskedLM.
+ Some weights of RobertaForMaskedLM were not initialized from the Flax model and are newly initialized: ['lm_head.decoder.weight', 'roberta.embeddings.position_ids', 'lm_head.decoder.bias']
+ You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+ Step... (8000/50000 | Loss: 1.7662373781204224, Acc: 0.6503182649612427): 18%|█████▏ | 9000/50000 [3:26:55<14:07:58, 1.24s/it]
+ Step... (8500 | Loss: 1.8929920196533203, Learning Rate: 0.0005030303145758808)
+ Step... (9000 | Loss: 1.841712236404419, Learning Rate: 0.0004969697329215705)
+ [05:35:18] - INFO - __main__ - Saving checkpoint at 9000 steps██████████████████████████████████████████████████████| 130/130 [00:21<00:00, 4.60it/s]
+ All Flax model weights were used when initializing RobertaForMaskedLM.
+ Some weights of RobertaForMaskedLM were not initialized from the Flax model and are newly initialized: ['lm_head.decoder.weight', 'roberta.embeddings.position_ids', 'lm_head.decoder.bias']
+ You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
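The surviving log lines interleave tqdm progress bars with periodic "Step..." evaluation summaries. If one wanted to recover the eval curve from output.log, a small parser over the exact format shown above would do; this sketch assumes that format holds throughout the file:

```python
import re

# Matches the eval-summary lines, e.g.
# "Step... (8000/50000 | Loss: 1.7662373781204224, Acc: 0.6503182649612427): ..."
PATTERN = re.compile(
    r"Step\.\.\. \((?P<step>\d+)/\d+ \| "
    r"Loss: (?P<loss>[\d.]+), Acc: (?P<acc>[\d.]+)\)"
)

def parse_eval_lines(log_path):
    """Yield (step, loss, accuracy) for each eval summary in the log."""
    with open(log_path) as f:
        for line in f:
            m = PATTERN.search(line)
            if m:
                yield int(m["step"]), float(m["loss"]), float(m["acc"])

# e.g. parse_eval_lines("wandb/run-20210726_001233-17u6inbn/files/output.log")
# would yield (6000, 1.778..., 0.6486...), (7000, ...), (8000, ...), ...
```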
wandb/run-20210726_001233-17u6inbn/files/wandb-summary.json CHANGED
@@ -1 +1 @@
- {"global_step": 6500, "_timestamp": 1627274252.003003, "train_time": 152312.671875, "train_learning_rate": 0.0005272727576084435, "_step": 12961, "train_loss": 1.897655963897705, "eval_accuracy": 0.6486639976501465, "eval_loss": 1.7780379056930542}
+ {"global_step": 9000, "_timestamp": 1627277689.819425, "train_time": 242086.03125, "train_learning_rate": 0.0004969697329215705, "_step": 17946, "train_loss": 1.800872564315796, "eval_accuracy": 0.6503182649612427, "eval_loss": 1.7662373781204224}
wandb/run-20210726_001233-17u6inbn/logs/debug-internal.log CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/run-20210726_001233-17u6inbn/run-17u6inbn.wandb CHANGED
Binary files a/wandb/run-20210726_001233-17u6inbn/run-17u6inbn.wandb and b/wandb/run-20210726_001233-17u6inbn/run-17u6inbn.wandb differ