bert-base-uncased fine-tuned on the IMDB dataset

The evaluation (dev) set was created by taking 1,000 samples from the original 25,000-example test split, leaving 24,000 examples for the final test set:
DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 25000
    })
    dev: Dataset({
        features: ['text', 'label'],
        num_rows: 1000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 24000
    })
})
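The splitting code itself isn't shown in this card; below is a minimal sketch of how an equivalent split could be produced with the `datasets` library (the seed and variable names are assumptions):

```python
# Minimal sketch (seed is an assumption) of reproducing the splits above.
from datasets import load_dataset, DatasetDict

imdb = load_dataset("imdb")

# Carve 1,000 dev examples out of the 25,000-example IMDB test split.
split = imdb["test"].train_test_split(test_size=1000, seed=42)

dataset = DatasetDict({
    "train": imdb["train"],   # 25,000 rows
    "dev": split["test"],     #  1,000 rows
    "test": split["train"],   # 24,000 rows
})
```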
Parameters
max_sequence_length = 128
batch_size = 32
eval_steps = 100
learning_rate = 2e-05
num_train_epochs = 5
early_stopping_patience = 10
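A sketch of a `Trainer` setup matching these parameters. The output directory, the early-stopping metric (`f1` here, consistent with the step counts below but not stated in the card), and the variable names are assumptions; `dataset` is the DatasetDict from the split sketch above:

```python
# Sketch of a Trainer configuration matching the parameters above.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

def tokenize(batch):
    # max_sequence_length = 128
    return tokenizer(batch["text"], truncation=True, max_length=128,
                     padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bert-base-uncased-imdb",  # assumed name
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=5,
    evaluation_strategy="steps",
    eval_steps=100,
    logging_steps=500,               # training loss logged every 500 steps
    save_strategy="steps",           # checkpoints must align with evals
    save_steps=100,
    load_best_model_at_end=True,     # required by EarlyStoppingCallback
    metric_for_best_model="f1",      # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["dev"],
    compute_metrics=compute_metrics,  # see the sketch further below
    callbacks=[EarlyStoppingCallback(early_stopping_patience=10)],
)
trainer.train()
```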
Training Run
Training stopped early at step 2700 of a scheduled 3910 (epoch 3.45 of 5, ~1h 12m at 0.63 it/s): the dev-set metrics peaked at step 1700 and did not improve over the following 10 evaluations, triggering early stopping. The dev set is evaluated every 100 steps; training loss is only logged every 500 steps, hence the "No log" entries.
| Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 | Eval Runtime (s) | Eval Samples/s |
|-----:|--------------:|----------------:|---------:|----------:|-------:|---:|-----------------:|---------------:|
| 100 | No log | 0.371974 | 0.845 | 0.798942 | 0.917004 | 0.853911 | 15.2569 | 65.544 |
| 200 | No log | 0.349631 | 0.850 | 0.873913 | 0.813765 | 0.842767 | 15.2886 | 65.408 |
| 300 | No log | 0.359376 | 0.845 | 0.869281 | 0.807692 | 0.837356 | 15.3039 | 65.343 |
| 400 | No log | 0.307613 | 0.870 | 0.851351 | 0.892713 | 0.871542 | 15.3584 | 65.111 |
| 500 | 0.3645 | 0.309362 | 0.856 | 0.807018 | 0.931174 | 0.864662 | 15.3261 | 65.248 |
| 600 | 0.3645 | 0.302709 | 0.867 | 0.881607 | 0.844130 | 0.862461 | 15.3244 | 65.255 |
| 700 | 0.3645 | 0.300102 | 0.871 | 0.894168 | 0.838057 | 0.865204 | 15.4749 | 64.621 |
| 800 | 0.3645 | 0.383784 | 0.866 | 0.833333 | 0.910931 | 0.870406 | 15.3801 | 65.019 |
| 900 | 0.3645 | 0.309934 | 0.874 | 0.881743 | 0.860324 | 0.870902 | 15.3589 | 65.109 |
| 1000 | 0.2546 | 0.332236 | 0.872 | 0.894397 | 0.840081 | 0.866388 | 15.4427 | 64.756 |
| 1100 | 0.2546 | 0.330807 | 0.871 | 0.877847 | 0.858300 | 0.867963 | 15.4109 | 64.889 |
| 1200 | 0.2546 | 0.352724 | 0.872 | 0.925581 | 0.805668 | 0.861472 | 15.2728 | 65.476 |
| 1300 | 0.2546 | 0.278529 | 0.881 | 0.891441 | 0.864372 | 0.877698 | 15.4082 | 64.900 |
| 1400 | 0.2546 | 0.291371 | 0.878 | 0.854962 | 0.906883 | 0.880157 | 15.4274 | 64.820 |
| 1500 | 0.2084 | 0.324827 | 0.869 | 0.904232 | 0.821862 | 0.861082 | 15.3386 | 65.195 |
| 1600 | 0.2084 | 0.377024 | 0.884 | 0.898734 | 0.862348 | 0.880165 | 15.4145 | 64.874 |
| 1700 | 0.2084 | 0.375274 | 0.885 | 0.881288 | 0.886640 | 0.883956 | 15.3672 | 65.073 |
| 1800 | 0.2084 | 0.378904 | 0.880 | 0.877016 | 0.880567 | 0.878788 | 15.3639 | 65.088 |
| 1900 | 0.2084 | 0.410517 | 0.874 | 0.866534 | 0.880567 | 0.873494 | 15.3247 | 65.254 |
| 2000 | 0.1308 | 0.404030 | 0.876 | 0.888655 | 0.856275 | 0.872165 | 15.4142 | 64.875 |
| 2100 | 0.1308 | 0.390763 | 0.883 | 0.882353 | 0.880567 | 0.881459 | 15.3415 | 65.183 |
| 2200 | 0.1308 | 0.417967 | 0.880 | 0.875502 | 0.882591 | 0.879032 | 15.3513 | 65.141 |
| 2300 | 0.1308 | 0.390974 | 0.883 | 0.898520 | 0.860324 | 0.879007 | 15.3961 | 64.952 |
| 2400 | 0.1308 | 0.479739 | 0.874 | 0.856589 | 0.894737 | 0.875248 | 15.4605 | 64.681 |
| 2500 | 0.0984 | 0.473215 | 0.875 | 0.883576 | 0.860324 | 0.871795 | 15.3922 | 64.968 |
| 2600 | 0.0984 | 0.532294 | 0.872 | 0.889362 | 0.846154 | 0.867220 | 15.3641 | 65.087 |
| 2700 | 0.0984 | 0.536664 | 0.881 | 0.880325 | 0.878543 | 0.879433 | 15.3511 | 65.142 |
TrainOutput(global_step=2700, training_loss=0.2004435383832013, metrics={'train_runtime': 4304.5331, 'train_samples_per_second': 0.908, 'total_flos': 7258763970957312, 'epoch': 3.45})
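The precision, recall, and F1 columns above track the positive class. The metric code isn't included in this card; a `compute_metrics` function of roughly this shape (a sketch, not the original code) would produce those columns:

```python
# Sketch of a compute_metrics function producing the columns logged above:
# accuracy plus binary (positive-class) precision/recall/F1.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```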
Classification Report

Final evaluation on the 24,000-example held-out test set (label 0 = negative, 1 = positive):

|              | Precision | Recall | F1-score | Support |
|--------------|----------:|-------:|---------:|--------:|
| 0 (negative) | 0.90 | 0.87 | 0.89 | 11994 |
| 1 (positive) | 0.87 | 0.90 | 0.89 | 12006 |
| accuracy     |      |      | 0.89 | 24000 |
| macro avg    | 0.89 | 0.89 | 0.89 | 24000 |
| weighted avg | 0.89 | 0.89 | 0.89 | 24000 |
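A sketch of how a report like this can be generated, reusing the (assumed) `trainer` and `tokenized` objects from the training sketch above:

```python
# Sketch: predict on the held-out test set and print an sklearn report.
from sklearn.metrics import classification_report

pred = trainer.predict(tokenized["test"])
y_pred = pred.predictions.argmax(axis=-1)
print(classification_report(pred.label_ids, y_pred))
```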