2024-09-09 11:53:51.396276: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 2024-09-09 11:53:51.414891: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered 2024-09-09 11:53:51.436268: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered 2024-09-09 11:53:51.442683: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered 2024-09-09 11:53:51.458047: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 2024-09-09 11:53:52.683988: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT /usr/local/lib/python3.10/dist-packages/transformers/training_args.py:1525: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead warnings.warn( 09/09/2024 11:53:54 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 09/09/2024 11:53:54 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, batch_eval_metrics=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=True, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=None, eval_strategy=epoch, eval_use_gather_object=False, evaluation_strategy=epoch, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=2, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=True, group_by_length=False, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, 
learning_rate=5e-05, length_column_name=length, load_best_model_at_end=True, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/content/dissertation/scripts/ner/output/tb, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=500, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=f1, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/content/dissertation/scripts/ner/output, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=8, per_device_train_batch_size=32, prediction_loss_only=False, push_to_hub=True, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/content/dissertation/scripts/ner/output, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=500, save_strategy=epoch, save_total_limit=None, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, ) Downloading builder script: 0%| | 0.00/3.91k [00:00> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-09 11:54:06,991 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "finetuning_task": "ner", "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "id2label": { "0": "O", "1": "B-SINTOMA", "2": "I-SINTOMA" }, "initializer_range": 0.02, "intermediate_size": 3072, "label2id": { "B-SINTOMA": 1, "I-SINTOMA": 2, "O": 0 }, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } [INFO|configuration_utils.py:733] 2024-09-09 11:54:07,264 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-09 11:54:07,265 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", 
"transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } [INFO|tokenization_utils_base.py:2269] 2024-09-09 11:54:07,275 >> loading file vocab.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/vocab.json [INFO|tokenization_utils_base.py:2269] 2024-09-09 11:54:07,275 >> loading file merges.txt from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/merges.txt [INFO|tokenization_utils_base.py:2269] 2024-09-09 11:54:07,275 >> loading file tokenizer.json from cache at None [INFO|tokenization_utils_base.py:2269] 2024-09-09 11:54:07,275 >> loading file added_tokens.json from cache at None [INFO|tokenization_utils_base.py:2269] 2024-09-09 11:54:07,275 >> loading file special_tokens_map.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/special_tokens_map.json [INFO|tokenization_utils_base.py:2269] 2024-09-09 11:54:07,275 >> loading file tokenizer_config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/tokenizer_config.json [INFO|configuration_utils.py:733] 2024-09-09 11:54:07,275 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-09 11:54:07,276 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } /usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. 
For more details check this issue: https://github.com/huggingface/transformers/issues/31884 warnings.warn( [INFO|configuration_utils.py:733] 2024-09-09 11:54:07,353 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-09 11:54:07,354 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } [INFO|modeling_utils.py:3678] 2024-09-09 11:54:07,676 >> loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/pytorch_model.bin [INFO|modeling_utils.py:4497] 2024-09-09 11:54:07,755 >> Some weights of the model checkpoint at PlanTL-GOB-ES/bsc-bio-ehr-es were not used when initializing RobertaForTokenClassification: ['lm_head.bias', 'lm_head.decoder.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight'] - This IS expected if you are initializing RobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing RobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). [WARNING|modeling_utils.py:4509] 2024-09-09 11:54:07,755 >> Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at PlanTL-GOB-ES/bsc-bio-ehr-es and are newly initialized: ['classifier.bias', 'classifier.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. Map: 0%| | 0/13013 [00:00> The following columns in the training set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:2134] 2024-09-09 11:54:12,775 >> ***** Running training ***** [INFO|trainer.py:2135] 2024-09-09 11:54:12,776 >> Num examples = 13,013 [INFO|trainer.py:2136] 2024-09-09 11:54:12,776 >> Num Epochs = 10 [INFO|trainer.py:2137] 2024-09-09 11:54:12,776 >> Instantaneous batch size per device = 32 [INFO|trainer.py:2140] 2024-09-09 11:54:12,776 >> Total train batch size (w. 
parallel, distributed & accumulation) = 64 [INFO|trainer.py:2141] 2024-09-09 11:54:12,776 >> Gradient Accumulation steps = 2 [INFO|trainer.py:2142] 2024-09-09 11:54:12,776 >> Total optimization steps = 2,030 [INFO|trainer.py:2143] 2024-09-09 11:54:12,776 >> Number of trainable parameters = 124,055,043 0%| | 0/2030 [00:00> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-09 11:55:48,644 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-09 11:55:48,644 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-09 11:55:48,644 >> Batch size = 8 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-203 [INFO|configuration_utils.py:472] 2024-09-09 11:55:54,553 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-203/config.json [INFO|modeling_utils.py:2799] 2024-09-09 11:55:55,568 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-203/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-09 11:55:55,569 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-203/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 11:55:55,569 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-203/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-09 11:56:00,182 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 11:56:00,183 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 10%|█ | 204/2030 [01:47<1:59:40, 3.93s/it] 10%|█ | 205/2030 [01:48<1:28:45, 2.92s/it] 10%|█ | 206/2030 [01:48<1:06:21, 2.18s/it] 10%|█ | 207/2030 [01:49<50:53, 1.67s/it] 10%|█ | 208/2030 [01:49<39:22, 1.30s/it] 10%|█ | 209/2030 [01:49<30:44, 1.01s/it] 10%|█ | 210/2030 [01:50<25:37, 1.18it/s] 10%|█ | 211/2030 [01:51<24:14, 1.25it/s] 10%|█ | 212/2030 [01:51<21:03, 1.44it/s] 10%|█ | 213/2030 [01:51<18:36, 1.63it/s] 11%|█ | 214/2030 [01:52<17:59, 1.68it/s] 11%|█ | 215/2030 [01:52<16:37, 1.82it/s] 11%|█ | 216/2030 [01:53<15:39, 1.93it/s] 11%|█ | 217/2030 [01:54<19:55, 1.52it/s] 11%|█ | 218/2030 [01:54<17:23, 1.74it/s] 11%|█ | 219/2030 [01:55<16:28, 1.83it/s] 11%|█ | 220/2030 [01:55<17:39, 1.71it/s] 11%|█ | 221/2030 [01:56<15:31, 1.94it/s] 11%|█ | 222/2030 [01:56<14:36, 2.06it/s] 11%|█ | 223/2030 [01:57<14:18, 2.11it/s] 11%|█ | 224/2030 [01:57<16:05, 1.87it/s] 11%|█ | 225/2030 [01:58<15:11, 1.98it/s] 11%|█ | 226/2030 [01:58<14:18, 2.10it/s] 11%|█ | 227/2030 [01:59<14:05, 2.13it/s] 11%|█ | 228/2030 [01:59<13:34, 2.21it/s] 11%|█▏ | 229/2030 [01:59<12:44, 2.36it/s] 11%|█▏ | 230/2030 [02:00<12:42, 2.36it/s] 11%|█▏ | 231/2030 [02:00<12:22, 2.42it/s] 11%|█▏ | 232/2030 [02:01<12:16, 2.44it/s] 11%|█▏ | 233/2030 [02:01<12:45, 2.35it/s] 12%|█▏ | 234/2030 [02:02<14:11, 2.11it/s] 12%|█▏ | 235/2030 [02:02<13:37, 2.20it/s] 12%|█▏ | 236/2030 [02:02<13:13, 2.26it/s] 12%|█▏ | 237/2030 [02:03<13:48, 2.16it/s] 12%|█▏ | 238/2030 [02:03<13:05, 2.28it/s] 12%|█▏ | 239/2030 [02:04<12:34, 2.37it/s] 12%|█▏ | 240/2030 [02:04<12:38, 2.36it/s] 12%|█▏ | 241/2030 [02:05<12:12, 2.44it/s] 12%|█▏ | 242/2030 [02:05<12:03, 2.47it/s] 12%|█▏ | 243/2030 
[02:05<13:09, 2.26it/s] 12%|█▏ | 244/2030 [02:06<13:57, 2.13it/s] 12%|█▏ | 245/2030 [02:06<12:28, 2.39it/s] 12%|█▏ | 246/2030 [02:07<11:47, 2.52it/s] 12%|█▏ | 247/2030 [02:07<12:04, 2.46it/s] 12%|█▏ | 248/2030 [02:07<11:32, 2.57it/s] 12%|█▏ | 249/2030 [02:08<12:22, 2.40it/s] 12%|█▏ | 250/2030 [02:08<12:07, 2.45it/s] 12%|█▏ | 251/2030 [02:09<11:40, 2.54it/s] 12%|█▏ | 252/2030 [02:09<14:13, 2.08it/s] 12%|█▏ | 253/2030 [02:10<15:51, 1.87it/s] 13%|█▎ | 254/2030 [02:10<14:48, 2.00it/s] 13%|█▎ | 255/2030 [02:11<16:12, 1.83it/s] 13%|█▎ | 256/2030 [02:12<15:55, 1.86it/s] 13%|█▎ | 257/2030 [02:12<15:12, 1.94it/s] 13%|█▎ | 258/2030 [02:13<16:09, 1.83it/s] 13%|█▎ | 259/2030 [02:13<14:20, 2.06it/s] 13%|█▎ | 260/2030 [02:13<13:31, 2.18it/s] 13%|█▎ | 261/2030 [02:14<16:16, 1.81it/s] 13%|█▎ | 262/2030 [02:15<14:54, 1.98it/s] 13%|█▎ | 263/2030 [02:15<14:32, 2.03it/s] 13%|█▎ | 264/2030 [02:16<15:26, 1.91it/s] 13%|█▎ | 265/2030 [02:16<14:04, 2.09it/s] 13%|█▎ | 266/2030 [02:16<13:00, 2.26it/s] 13%|█▎ | 267/2030 [02:17<12:11, 2.41it/s] 13%|█▎ | 268/2030 [02:17<12:46, 2.30it/s] 13%|█▎ | 269/2030 [02:18<11:57, 2.46it/s] 13%|█▎ | 270/2030 [02:18<12:20, 2.38it/s] 13%|█▎ | 271/2030 [02:19<12:47, 2.29it/s] 13%|█▎ | 272/2030 [02:19<13:52, 2.11it/s] 13%|█▎ | 273/2030 [02:20<13:54, 2.11it/s] 13%|█▎ | 274/2030 [02:20<12:44, 2.30it/s] 14%|█▎ | 275/2030 [02:20<12:38, 2.31it/s] 14%|█▎ | 276/2030 [02:21<12:42, 2.30it/s] 14%|█▎ | 277/2030 [02:21<12:34, 2.32it/s] 14%|█▎ | 278/2030 [02:22<12:30, 2.33it/s] 14%|█▎ | 279/2030 [02:22<12:28, 2.34it/s] 14%|█▍ | 280/2030 [02:22<11:52, 2.46it/s] 14%|█▍ | 281/2030 [02:23<12:37, 2.31it/s] 14%|█▍ | 282/2030 [02:23<12:16, 2.37it/s] 14%|█▍ | 283/2030 [02:24<12:33, 2.32it/s] 14%|█▍ | 284/2030 [02:24<13:13, 2.20it/s] 14%|█▍ | 285/2030 [02:25<13:08, 2.21it/s] 14%|█▍ | 286/2030 [02:25<13:09, 2.21it/s] 14%|█▍ | 287/2030 [02:26<12:27, 2.33it/s] 14%|█▍ | 288/2030 [02:26<13:57, 2.08it/s] 14%|█▍ | 289/2030 [02:27<13:51, 2.09it/s] 14%|█▍ | 290/2030 [02:27<15:16, 1.90it/s] 14%|█▍ | 291/2030 [02:28<14:18, 2.02it/s] 14%|█▍ | 292/2030 [02:28<15:12, 1.91it/s] 14%|█▍ | 293/2030 [02:29<14:05, 2.05it/s] 14%|█▍ | 294/2030 [02:29<15:37, 1.85it/s] 15%|█▍ | 295/2030 [02:30<13:51, 2.09it/s] 15%|█▍ | 296/2030 [02:30<13:08, 2.20it/s] 15%|█▍ | 297/2030 [02:31<13:30, 2.14it/s] 15%|█▍ | 298/2030 [02:31<13:08, 2.20it/s] 15%|█▍ | 299/2030 [02:31<13:26, 2.15it/s] 15%|█▍ | 300/2030 [02:32<14:33, 1.98it/s] 15%|█▍ | 301/2030 [02:32<14:07, 2.04it/s] 15%|█▍ | 302/2030 [02:33<13:49, 2.08it/s] 15%|█▍ | 303/2030 [02:33<12:42, 2.26it/s] 15%|█▍ | 304/2030 [02:34<11:59, 2.40it/s] 15%|█▌ | 305/2030 [02:34<14:03, 2.04it/s] 15%|█▌ | 306/2030 [02:35<13:33, 2.12it/s] 15%|█▌ | 307/2030 [02:35<13:04, 2.20it/s] 15%|█▌ | 308/2030 [02:36<13:12, 2.17it/s] 15%|█▌ | 309/2030 [02:36<12:55, 2.22it/s] 15%|█▌ | 310/2030 [02:37<13:23, 2.14it/s] 15%|█▌ | 311/2030 [02:37<12:53, 2.22it/s] 15%|█▌ | 312/2030 [02:37<12:41, 2.25it/s] 15%|█▌ | 313/2030 [02:38<13:26, 2.13it/s] 15%|█▌ | 314/2030 [02:38<13:53, 2.06it/s] 16%|█▌ | 315/2030 [02:39<13:04, 2.19it/s] 16%|█▌ | 316/2030 [02:39<13:44, 2.08it/s] 16%|█▌ | 317/2030 [02:40<14:44, 1.94it/s] 16%|█▌ | 318/2030 [02:40<14:00, 2.04it/s] 16%|█▌ | 319/2030 [02:41<13:12, 2.16it/s] 16%|█▌ | 320/2030 [02:41<13:30, 2.11it/s] 16%|█▌ | 321/2030 [02:42<13:18, 2.14it/s] 16%|█▌ | 322/2030 [02:42<14:01, 2.03it/s] 16%|█▌ | 323/2030 [02:43<13:30, 2.11it/s] 16%|█▌ | 324/2030 [02:43<13:54, 2.04it/s] 16%|█▌ | 325/2030 [02:44<12:44, 2.23it/s] 16%|█▌ | 326/2030 [02:44<12:54, 2.20it/s] 16%|█▌ | 327/2030 [02:45<14:11, 2.00it/s] 
16%|█▌ | 328/2030 [02:45<13:37, 2.08it/s] 16%|█▌ | 329/2030 [02:45<12:07, 2.34it/s] 16%|█▋ | 330/2030 [02:46<11:46, 2.41it/s] 16%|█▋ | 331/2030 [02:46<11:52, 2.38it/s] 16%|█▋ | 332/2030 [02:47<12:06, 2.34it/s] 16%|█▋ | 333/2030 [02:47<12:21, 2.29it/s] 16%|█▋ | 334/2030 [02:48<12:17, 2.30it/s] 17%|█▋ | 335/2030 [02:48<15:27, 1.83it/s] 17%|█▋ | 336/2030 [02:49<14:13, 1.98it/s] 17%|█▋ | 337/2030 [02:49<14:16, 1.98it/s] 17%|█▋ | 338/2030 [02:50<13:03, 2.16it/s] 17%|█▋ | 339/2030 [02:50<12:44, 2.21it/s] 17%|█▋ | 340/2030 [02:51<17:41, 1.59it/s] 17%|█▋ | 341/2030 [02:52<18:04, 1.56it/s] 17%|█▋ | 342/2030 [02:52<16:49, 1.67it/s] 17%|█▋ | 343/2030 [02:53<15:23, 1.83it/s] 17%|█▋ | 344/2030 [02:53<13:53, 2.02it/s] 17%|█▋ | 345/2030 [02:54<15:18, 1.83it/s] 17%|█▋ | 346/2030 [02:54<14:12, 1.98it/s] 17%|█▋ | 347/2030 [02:55<13:19, 2.11it/s] 17%|█▋ | 348/2030 [02:55<13:30, 2.08it/s] 17%|█▋ | 349/2030 [02:56<14:05, 1.99it/s] 17%|█▋ | 350/2030 [02:56<13:39, 2.05it/s] 17%|█▋ | 351/2030 [02:57<13:26, 2.08it/s] 17%|█▋ | 352/2030 [02:57<13:56, 2.01it/s] 17%|█▋ | 353/2030 [02:58<16:18, 1.71it/s] 17%|█▋ | 354/2030 [02:58<15:13, 1.84it/s] 17%|█▋ | 355/2030 [02:59<14:56, 1.87it/s] 18%|█▊ | 356/2030 [02:59<14:51, 1.88it/s] 18%|█▊ | 357/2030 [03:00<14:00, 1.99it/s] 18%|█▊ | 358/2030 [03:00<14:26, 1.93it/s] 18%|█▊ | 359/2030 [03:01<13:47, 2.02it/s] 18%|█▊ | 360/2030 [03:01<12:53, 2.16it/s] 18%|█▊ | 361/2030 [03:02<13:00, 2.14it/s] 18%|█▊ | 362/2030 [03:02<12:09, 2.29it/s] 18%|█▊ | 363/2030 [03:02<11:28, 2.42it/s] 18%|█▊ | 364/2030 [03:03<11:51, 2.34it/s] 18%|█▊ | 365/2030 [03:03<12:11, 2.28it/s] 18%|█▊ | 366/2030 [03:04<12:25, 2.23it/s] 18%|█▊ | 367/2030 [03:04<11:17, 2.45it/s] 18%|█▊ | 368/2030 [03:05<17:17, 1.60it/s] 18%|█▊ | 369/2030 [03:06<16:18, 1.70it/s] 18%|█▊ | 370/2030 [03:06<16:17, 1.70it/s] 18%|█▊ | 371/2030 [03:07<15:54, 1.74it/s] 18%|█▊ | 372/2030 [03:07<15:53, 1.74it/s] 18%|█▊ | 373/2030 [03:08<13:59, 1.97it/s] 18%|█▊ | 374/2030 [03:08<13:06, 2.11it/s] 18%|█▊ | 375/2030 [03:09<12:35, 2.19it/s] 19%|█▊ | 376/2030 [03:09<13:07, 2.10it/s] 19%|█▊ | 377/2030 [03:09<12:02, 2.29it/s] 19%|█▊ | 378/2030 [03:10<12:18, 2.24it/s] 19%|█▊ | 379/2030 [03:10<11:32, 2.39it/s] 19%|█▊ | 380/2030 [03:11<12:03, 2.28it/s] 19%|█▉ | 381/2030 [03:11<12:14, 2.24it/s] 19%|█▉ | 382/2030 [03:12<11:44, 2.34it/s] 19%|█▉ | 383/2030 [03:12<11:36, 2.36it/s] 19%|█▉ | 384/2030 [03:12<10:45, 2.55it/s] 19%|█▉ | 385/2030 [03:13<10:56, 2.51it/s] 19%|█▉ | 386/2030 [03:13<11:08, 2.46it/s] 19%|█▉ | 387/2030 [03:14<11:13, 2.44it/s] 19%|█▉ | 388/2030 [03:14<11:16, 2.43it/s] 19%|█▉ | 389/2030 [03:14<11:19, 2.42it/s] 19%|█▉ | 390/2030 [03:15<11:05, 2.46it/s] 19%|█▉ | 391/2030 [03:15<12:47, 2.13it/s] 19%|█▉ | 392/2030 [03:16<15:14, 1.79it/s] 19%|█▉ | 393/2030 [03:17<15:13, 1.79it/s] 19%|█▉ | 394/2030 [03:17<15:02, 1.81it/s] 19%|█▉ | 395/2030 [03:18<15:13, 1.79it/s] 20%|█▉ | 396/2030 [03:19<16:49, 1.62it/s] 20%|█▉ | 397/2030 [03:19<15:15, 1.78it/s] 20%|█▉ | 398/2030 [03:20<14:13, 1.91it/s] 20%|█▉ | 399/2030 [03:20<15:23, 1.77it/s] 20%|█▉ | 400/2030 [03:21<14:32, 1.87it/s] 20%|█▉ | 401/2030 [03:21<13:48, 1.97it/s] 20%|█▉ | 402/2030 [03:21<12:29, 2.17it/s] 20%|█▉ | 403/2030 [03:22<12:24, 2.18it/s] 20%|█▉ | 404/2030 [03:22<12:28, 2.17it/s] 20%|█▉ | 405/2030 [03:23<11:54, 2.27it/s] 20%|██ | 406/2030 [03:23<12:50, 2.11it/s] 20%|██ | 407/2030 [03:24<12:00, 2.25it/s][INFO|trainer.py:811] 2024-09-09 11:57:36,964 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: 
tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-09 11:57:36,967 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-09 11:57:36,967 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-09 11:57:36,967 >> Batch size = 8 {'eval_loss': 0.15010379254817963, 'eval_precision': 0.5959855892949047, 'eval_recall': 0.6338259441707718, 'eval_f1': 0.6143236074270556, 'eval_accuracy': 0.9467740383072925, 'eval_runtime': 5.907, 'eval_samples_per_second': 426.445, 'eval_steps_per_second': 53.327, 'epoch': 1.0} 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-407 [INFO|configuration_utils.py:472] 2024-09-09 11:57:42,863 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-407/config.json [INFO|modeling_utils.py:2799] 2024-09-09 11:57:43,887 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-407/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-09 11:57:43,888 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-407/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 11:57:43,888 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-407/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-09 11:57:48,014 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 11:57:48,015 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 20%|██ | 408/2030 [03:35<1:42:07, 3.78s/it] 20%|██ | 409/2030 [03:36<1:14:36, 2.76s/it] 20%|██ | 410/2030 [03:36<55:53, 2.07s/it] 20%|██ | 411/2030 [03:37<43:27, 1.61s/it] 20%|██ | 412/2030 [03:37<33:47, 1.25s/it] 20%|██ | 413/2030 [03:37<27:05, 1.01s/it] 20%|██ | 414/2030 [03:38<21:49, 1.23it/s] 20%|██ | 415/2030 [03:38<18:01, 1.49it/s] 20%|██ | 416/2030 [03:39<17:07, 1.57it/s] 21%|██ | 417/2030 [03:39<14:44, 1.82it/s] 21%|██ | 418/2030 [03:39<13:20, 2.01it/s] 21%|██ | 419/2030 [03:40<12:25, 2.16it/s] 21%|██ | 420/2030 [03:40<11:35, 2.31it/s] 21%|██ | 421/2030 [03:41<11:46, 2.28it/s] 21%|██ | 422/2030 [03:41<11:29, 2.33it/s] 21%|██ | 423/2030 [03:41<10:51, 2.47it/s] 21%|██ | 424/2030 [03:42<11:17, 2.37it/s] 21%|██ | 425/2030 [03:42<10:51, 2.46it/s] 21%|██ | 426/2030 [03:43<10:09, 2.63it/s] 21%|██ | 427/2030 [03:43<11:34, 2.31it/s] 21%|██ | 428/2030 [03:43<11:15, 2.37it/s] 21%|██ | 429/2030 [03:44<11:04, 2.41it/s] 21%|██ | 430/2030 [03:44<11:52, 2.24it/s] 21%|██ | 431/2030 [03:45<12:22, 2.15it/s] 21%|██▏ | 432/2030 [03:45<12:04, 2.21it/s] 21%|██▏ | 433/2030 [03:46<12:06, 2.20it/s] 21%|██▏ | 434/2030 [03:46<11:49, 2.25it/s] 21%|██▏ | 435/2030 [03:47<11:40, 2.28it/s] 21%|██▏ | 436/2030 [03:47<11:38, 2.28it/s] 22%|██▏ | 437/2030 [03:48<11:43, 2.26it/s] 22%|██▏ | 438/2030 [03:48<11:49, 2.24it/s] 22%|██▏ | 439/2030 [03:48<11:21, 2.33it/s] 22%|██▏ | 440/2030 [03:49<11:26, 2.32it/s] 22%|██▏ | 441/2030 [03:49<11:26, 2.32it/s] 22%|██▏ | 442/2030 [03:50<11:27, 2.31it/s] 22%|██▏ | 443/2030 [03:50<10:59, 2.41it/s] 22%|██▏ | 444/2030 [03:51<11:59, 2.20it/s] 22%|██▏ | 445/2030 [03:51<11:41, 2.26it/s] 22%|██▏ | 446/2030 [03:52<12:32, 2.10it/s] 22%|██▏ | 447/2030 [03:52<13:39, 1.93it/s] 22%|██▏ | 448/2030 [03:53<12:16, 2.15it/s] 22%|██▏ | 449/2030 [03:53<12:03, 2.19it/s] 22%|██▏ | 450/2030 [03:53<12:04, 2.18it/s] 
22%|██▏ | 451/2030 [03:54<12:47, 2.06it/s] 22%|██▏ | 452/2030 [03:55<14:24, 1.83it/s] 22%|██▏ | 453/2030 [03:55<15:25, 1.70it/s] 22%|██▏ | 454/2030 [03:56<14:59, 1.75it/s] 22%|██▏ | 455/2030 [03:56<13:25, 1.96it/s] 22%|██▏ | 456/2030 [03:57<13:25, 1.95it/s] 23%|██▎ | 457/2030 [03:57<13:00, 2.02it/s] 23%|██▎ | 458/2030 [03:58<13:08, 1.99it/s] 23%|██▎ | 459/2030 [03:58<13:54, 1.88it/s] 23%|██▎ | 460/2030 [03:59<13:24, 1.95it/s] 23%|██▎ | 461/2030 [03:59<13:11, 1.98it/s] 23%|██▎ | 462/2030 [04:00<12:12, 2.14it/s] 23%|██▎ | 463/2030 [04:00<13:31, 1.93it/s] 23%|██▎ | 464/2030 [04:01<12:23, 2.11it/s] 23%|██▎ | 465/2030 [04:01<11:55, 2.19it/s] 23%|██▎ | 466/2030 [04:02<11:24, 2.29it/s] 23%|██▎ | 467/2030 [04:02<11:04, 2.35it/s] 23%|██▎ | 468/2030 [04:02<11:07, 2.34it/s] 23%|██▎ | 469/2030 [04:03<15:26, 1.68it/s] 23%|██▎ | 470/2030 [04:04<14:11, 1.83it/s] 23%|██▎ | 471/2030 [04:04<12:33, 2.07it/s] 23%|██▎ | 472/2030 [04:05<13:26, 1.93it/s] 23%|██▎ | 473/2030 [04:05<12:45, 2.03it/s] 23%|██▎ | 474/2030 [04:05<11:48, 2.20it/s] 23%|██▎ | 475/2030 [04:06<13:34, 1.91it/s] 23%|██▎ | 476/2030 [04:07<12:11, 2.12it/s] 23%|██▎ | 477/2030 [04:07<12:14, 2.11it/s] 24%|██▎ | 478/2030 [04:08<12:59, 1.99it/s] 24%|██▎ | 479/2030 [04:08<12:50, 2.01it/s] 24%|██▎ | 480/2030 [04:08<12:15, 2.11it/s] 24%|██▎ | 481/2030 [04:09<11:12, 2.30it/s] 24%|██▎ | 482/2030 [04:09<10:51, 2.38it/s] 24%|██▍ | 483/2030 [04:10<10:58, 2.35it/s] 24%|██▍ | 484/2030 [04:10<10:34, 2.43it/s] 24%|██▍ | 485/2030 [04:10<10:57, 2.35it/s] 24%|██▍ | 486/2030 [04:11<11:17, 2.28it/s] 24%|██▍ | 487/2030 [04:11<10:37, 2.42it/s] 24%|██▍ | 488/2030 [04:12<10:35, 2.43it/s] 24%|██▍ | 489/2030 [04:12<12:24, 2.07it/s] 24%|██▍ | 490/2030 [04:13<13:26, 1.91it/s] 24%|██▍ | 491/2030 [04:13<12:28, 2.06it/s] 24%|██▍ | 492/2030 [04:14<12:05, 2.12it/s] 24%|██▍ | 493/2030 [04:14<11:17, 2.27it/s] 24%|██▍ | 494/2030 [04:15<11:15, 2.27it/s] 24%|██▍ | 495/2030 [04:15<10:57, 2.33it/s] 24%|██▍ | 496/2030 [04:15<11:16, 2.27it/s] 24%|██▍ | 497/2030 [04:16<11:11, 2.28it/s] 25%|██▍ | 498/2030 [04:17<12:23, 2.06it/s] 25%|██▍ | 499/2030 [04:17<10:59, 2.32it/s] 25%|██▍ | 500/2030 [04:18<13:30, 1.89it/s] 25%|██▍ | 500/2030 [04:18<13:30, 1.89it/s] 25%|██▍ | 501/2030 [04:18<12:28, 2.04it/s] 25%|██▍ | 502/2030 [04:18<12:10, 2.09it/s] 25%|██▍ | 503/2030 [04:19<11:13, 2.27it/s] 25%|██▍ | 504/2030 [04:19<10:48, 2.35it/s] 25%|██▍ | 505/2030 [04:20<10:15, 2.48it/s] 25%|██▍ | 506/2030 [04:20<10:00, 2.54it/s] 25%|██▍ | 507/2030 [04:20<11:06, 2.28it/s] 25%|██▌ | 508/2030 [04:21<11:14, 2.26it/s] 25%|██▌ | 509/2030 [04:21<11:39, 2.18it/s] 25%|██▌ | 510/2030 [04:22<11:35, 2.18it/s] 25%|██▌ | 511/2030 [04:22<11:52, 2.13it/s] 25%|██▌ | 512/2030 [04:23<10:57, 2.31it/s] 25%|██▌ | 513/2030 [04:23<11:41, 2.16it/s] 25%|██▌ | 514/2030 [04:24<10:46, 2.34it/s] 25%|██▌ | 515/2030 [04:24<10:03, 2.51it/s] 25%|██▌ | 516/2030 [04:24<10:01, 2.52it/s] 25%|██▌ | 517/2030 [04:25<10:02, 2.51it/s] 26%|██▌ | 518/2030 [04:25<09:28, 2.66it/s] 26%|██▌ | 519/2030 [04:25<09:36, 2.62it/s] 26%|██▌ | 520/2030 [04:26<10:16, 2.45it/s] 26%|██▌ | 521/2030 [04:26<10:18, 2.44it/s] 26%|██▌ | 522/2030 [04:27<11:24, 2.20it/s] 26%|██▌ | 523/2030 [04:27<12:07, 2.07it/s] 26%|██▌ | 524/2030 [04:28<11:26, 2.19it/s] 26%|██▌ | 525/2030 [04:28<10:51, 2.31it/s] 26%|██▌ | 526/2030 [04:29<12:02, 2.08it/s] 26%|██▌ | 527/2030 [04:29<11:56, 2.10it/s] 26%|██▌ | 528/2030 [04:30<11:10, 2.24it/s] 26%|██▌ | 529/2030 [04:30<11:17, 2.22it/s] 26%|██▌ | 530/2030 [04:31<11:18, 2.21it/s] 26%|██▌ | 531/2030 [04:31<11:00, 2.27it/s] 26%|██▌ | 532/2030 
[04:31<10:29, 2.38it/s] 26%|██▋ | 533/2030 [04:32<10:29, 2.38it/s] 26%|██▋ | 534/2030 [04:32<09:53, 2.52it/s] 26%|██▋ | 535/2030 [04:32<10:05, 2.47it/s] 26%|██▋ | 536/2030 [04:33<09:42, 2.56it/s] 26%|██▋ | 537/2030 [04:33<10:24, 2.39it/s] 27%|██▋ | 538/2030 [04:34<10:46, 2.31it/s] 27%|██▋ | 539/2030 [04:34<10:34, 2.35it/s] 27%|██▋ | 540/2030 [04:35<10:10, 2.44it/s] 27%|██▋ | 541/2030 [04:35<10:55, 2.27it/s] 27%|██▋ | 542/2030 [04:35<10:28, 2.37it/s] 27%|██▋ | 543/2030 [04:36<09:55, 2.50it/s] 27%|██▋ | 544/2030 [04:37<12:06, 2.05it/s] 27%|██▋ | 545/2030 [04:37<11:51, 2.09it/s] 27%|██▋ | 546/2030 [04:37<11:59, 2.06it/s] 27%|██▋ | 547/2030 [04:38<11:23, 2.17it/s] 27%|██▋ | 548/2030 [04:38<11:56, 2.07it/s] 27%|██▋ | 549/2030 [04:39<11:42, 2.11it/s] 27%|██▋ | 550/2030 [04:39<11:10, 2.21it/s] 27%|██▋ | 551/2030 [04:40<12:14, 2.01it/s] 27%|██▋ | 552/2030 [04:40<11:45, 2.09it/s] 27%|██▋ | 553/2030 [04:41<10:46, 2.29it/s] 27%|██▋ | 554/2030 [04:41<10:22, 2.37it/s] 27%|██▋ | 555/2030 [04:42<13:25, 1.83it/s] 27%|██▋ | 556/2030 [04:42<12:29, 1.97it/s] 27%|██▋ | 557/2030 [04:43<11:54, 2.06it/s] 27%|██▋ | 558/2030 [04:43<11:29, 2.14it/s] 28%|██▊ | 559/2030 [04:44<12:41, 1.93it/s] 28%|██▊ | 560/2030 [04:44<11:43, 2.09it/s] 28%|██▊ | 561/2030 [04:45<12:54, 1.90it/s] 28%|██▊ | 562/2030 [04:45<13:00, 1.88it/s] 28%|██▊ | 563/2030 [04:46<13:09, 1.86it/s] 28%|██▊ | 564/2030 [04:46<13:16, 1.84it/s] 28%|██▊ | 565/2030 [04:47<12:09, 2.01it/s] 28%|██▊ | 566/2030 [04:47<11:21, 2.15it/s] 28%|██▊ | 567/2030 [04:48<11:22, 2.14it/s] 28%|██▊ | 568/2030 [04:48<11:05, 2.20it/s] 28%|██▊ | 569/2030 [04:49<11:16, 2.16it/s] 28%|██▊ | 570/2030 [04:49<11:09, 2.18it/s] 28%|██▊ | 571/2030 [04:49<10:46, 2.26it/s] 28%|██▊ | 572/2030 [04:50<10:59, 2.21it/s] 28%|██▊ | 573/2030 [04:50<10:29, 2.31it/s] 28%|██▊ | 574/2030 [04:51<10:08, 2.39it/s] 28%|██▊ | 575/2030 [04:51<10:14, 2.37it/s] 28%|██▊ | 576/2030 [04:52<09:59, 2.42it/s] 28%|██▊ | 577/2030 [04:52<12:18, 1.97it/s] 28%|██▊ | 578/2030 [04:53<12:25, 1.95it/s] 29%|██▊ | 579/2030 [04:53<11:44, 2.06it/s] 29%|██▊ | 580/2030 [04:54<12:14, 1.97it/s] 29%|██▊ | 581/2030 [04:54<11:54, 2.03it/s] 29%|██▊ | 582/2030 [04:55<12:50, 1.88it/s] 29%|██▊ | 583/2030 [04:55<12:13, 1.97it/s] 29%|██▉ | 584/2030 [04:56<11:23, 2.11it/s] 29%|██▉ | 585/2030 [04:56<11:37, 2.07it/s] 29%|██▉ | 586/2030 [04:57<11:00, 2.19it/s] 29%|██▉ | 587/2030 [04:57<10:51, 2.21it/s] 29%|██▉ | 588/2030 [04:58<11:13, 2.14it/s] 29%|██▉ | 589/2030 [04:58<11:11, 2.15it/s] 29%|██▉ | 590/2030 [04:58<10:27, 2.30it/s] 29%|██▉ | 591/2030 [04:59<10:22, 2.31it/s] 29%|██▉ | 592/2030 [05:00<13:37, 1.76it/s] 29%|██▉ | 593/2030 [05:00<12:36, 1.90it/s] 29%|██▉ | 594/2030 [05:01<14:48, 1.62it/s] 29%|██▉ | 595/2030 [05:01<13:04, 1.83it/s] 29%|██▉ | 596/2030 [05:02<12:20, 1.94it/s] 29%|██▉ | 597/2030 [05:03<14:06, 1.69it/s] 29%|██▉ | 598/2030 [05:03<13:25, 1.78it/s] 30%|██▉ | 599/2030 [05:04<13:41, 1.74it/s] 30%|██▉ | 600/2030 [05:04<12:55, 1.84it/s] 30%|██▉ | 601/2030 [05:05<12:45, 1.87it/s] 30%|██▉ | 602/2030 [05:05<11:08, 2.14it/s] 30%|██▉ | 603/2030 [05:05<11:30, 2.07it/s] 30%|██▉ | 604/2030 [05:06<11:35, 2.05it/s] 30%|██▉ | 605/2030 [05:06<10:59, 2.16it/s] 30%|██▉ | 606/2030 [05:07<10:20, 2.29it/s] 30%|██▉ | 607/2030 [05:07<09:23, 2.53it/s] 30%|██▉ | 608/2030 [05:08<10:06, 2.35it/s] 30%|███ | 609/2030 [05:08<10:17, 2.30it/s] 30%|███ | 610/2030 [05:08<10:16, 2.30it/s][INFO|trainer.py:811] 2024-09-09 11:59:21,842 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and 
have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-09 11:59:21,844 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-09 11:59:21,844 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-09 11:59:21,844 >> Batch size = 8 {'eval_loss': 0.17612887918949127, 'eval_precision': 0.6529351184346035, 'eval_recall': 0.6940339354132458, 'eval_f1': 0.6728575218890952, 'eval_accuracy': 0.949244441592608, 'eval_runtime': 5.8933, 'eval_samples_per_second': 427.436, 'eval_steps_per_second': 53.451, 'epoch': 2.0} {'loss': 0.1312, 'grad_norm': 0.6181371212005615, 'learning_rate': 3.768472906403941e-05, 'epoch': 2.46} 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-610 [INFO|configuration_utils.py:472] 2024-09-09 11:59:27,692 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-610/config.json [INFO|modeling_utils.py:2799] 2024-09-09 11:59:28,717 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-610/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-09 11:59:28,718 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-610/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 11:59:28,718 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-610/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-09 11:59:31,818 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 11:59:31,819 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 30%|███ | 611/2030 [05:19<1:20:23, 3.40s/it] 30%|███ | 612/2030 [05:19<59:41, 2.53s/it] 30%|███ | 613/2030 [05:20<47:12, 2.00s/it] 30%|███ | 614/2030 [05:20<35:37, 1.51s/it] 30%|███ | 615/2030 [05:21<27:24, 1.16s/it] 30%|███ | 616/2030 [05:21<23:10, 1.02it/s] 30%|███ | 617/2030 [05:22<19:02, 1.24it/s] 30%|███ | 618/2030 [05:22<17:57, 1.31it/s] 30%|███ | 619/2030 [05:23<15:09, 1.55it/s] 31%|███ | 620/2030 [05:23<13:12, 1.78it/s] 31%|███ | 621/2030 [05:23<11:55, 1.97it/s] 31%|███ | 622/2030 [05:24<11:43, 2.00it/s] 31%|███ | 623/2030 [05:24<10:55, 2.15it/s] 31%|███ | 624/2030 [05:25<11:36, 2.02it/s] 31%|███ | 625/2030 [05:26<13:00, 1.80it/s] 31%|███ | 626/2030 [05:26<12:31, 1.87it/s] 31%|███ | 627/2030 [05:26<11:20, 2.06it/s] 31%|███ | 628/2030 [05:27<10:13, 2.28it/s] 31%|███ | 629/2030 [05:27<10:31, 2.22it/s] 31%|███ | 630/2030 [05:28<10:39, 2.19it/s] 31%|███ | 631/2030 [05:29<12:59, 1.79it/s] 31%|███ | 632/2030 [05:29<12:03, 1.93it/s] 31%|███ | 633/2030 [05:29<11:21, 2.05it/s] 31%|███ | 634/2030 [05:30<11:31, 2.02it/s] 31%|███▏ | 635/2030 [05:30<11:03, 2.10it/s] 31%|███▏ | 636/2030 [05:31<10:53, 2.13it/s] 31%|███▏ | 637/2030 [05:31<10:19, 2.25it/s] 31%|███▏ | 638/2030 [05:32<09:58, 2.33it/s] 31%|███▏ | 639/2030 [05:32<10:21, 2.24it/s] 32%|███▏ | 640/2030 [05:32<09:53, 2.34it/s] 32%|███▏ | 641/2030 [05:33<09:42, 2.38it/s] 32%|███▏ | 642/2030 [05:33<10:11, 2.27it/s] 32%|███▏ | 643/2030 [05:34<10:48, 2.14it/s] 32%|███▏ | 644/2030 [05:34<10:16, 2.25it/s] 32%|███▏ | 645/2030 [05:35<09:57, 2.32it/s] 32%|███▏ | 646/2030 [05:35<09:37, 2.40it/s] 32%|███▏ | 647/2030 [05:35<10:13, 2.26it/s] 32%|███▏ | 648/2030 [05:36<09:30, 2.42it/s] 32%|███▏ | 649/2030 [05:36<09:54, 2.32it/s] 32%|███▏ | 
650/2030 [05:37<10:46, 2.14it/s] 32%|███▏ | 651/2030 [05:37<09:52, 2.33it/s] 32%|███▏ | 652/2030 [05:38<10:02, 2.29it/s] 32%|███▏ | 653/2030 [05:38<09:50, 2.33it/s] 32%|███▏ | 654/2030 [05:39<10:09, 2.26it/s] 32%|███▏ | 655/2030 [05:39<11:45, 1.95it/s] 32%|███▏ | 656/2030 [05:40<11:24, 2.01it/s] 32%|███▏ | 657/2030 [05:40<11:31, 1.99it/s] 32%|███▏ | 658/2030 [05:41<11:21, 2.01it/s] 32%|███▏ | 659/2030 [05:41<11:08, 2.05it/s] 33%|███▎ | 660/2030 [05:42<11:03, 2.07it/s] 33%|███▎ | 661/2030 [05:42<11:14, 2.03it/s] 33%|███▎ | 662/2030 [05:43<10:51, 2.10it/s] 33%|███▎ | 663/2030 [05:43<11:08, 2.05it/s] 33%|███▎ | 664/2030 [05:43<10:24, 2.19it/s] 33%|███▎ | 665/2030 [05:44<10:40, 2.13it/s] 33%|███▎ | 666/2030 [05:44<09:49, 2.31it/s] 33%|███▎ | 667/2030 [05:45<09:04, 2.51it/s] 33%|███▎ | 668/2030 [05:45<09:51, 2.30it/s] 33%|███▎ | 669/2030 [05:46<10:02, 2.26it/s] 33%|███▎ | 670/2030 [05:46<10:05, 2.25it/s] 33%|███▎ | 671/2030 [05:46<09:36, 2.36it/s] 33%|███▎ | 672/2030 [05:47<09:17, 2.43it/s] 33%|███▎ | 673/2030 [05:47<09:15, 2.44it/s] 33%|███▎ | 674/2030 [05:48<11:19, 2.00it/s] 33%|███▎ | 675/2030 [05:48<10:49, 2.09it/s] 33%|███▎ | 676/2030 [05:49<09:53, 2.28it/s] 33%|███▎ | 677/2030 [05:49<09:54, 2.27it/s] 33%|███▎ | 678/2030 [05:50<10:02, 2.24it/s] 33%|███▎ | 679/2030 [05:50<09:58, 2.26it/s] 33%|███▎ | 680/2030 [05:51<10:56, 2.06it/s] 34%|███▎ | 681/2030 [05:51<10:20, 2.18it/s] 34%|███▎ | 682/2030 [05:51<09:39, 2.33it/s] 34%|███▎ | 683/2030 [05:52<09:30, 2.36it/s] 34%|███▎ | 684/2030 [05:52<09:44, 2.30it/s] 34%|███▎ | 685/2030 [05:53<10:04, 2.22it/s] 34%|███▍ | 686/2030 [05:53<09:32, 2.35it/s] 34%|███▍ | 687/2030 [05:53<08:54, 2.51it/s] 34%|███▍ | 688/2030 [05:54<08:44, 2.56it/s] 34%|███▍ | 689/2030 [05:54<09:17, 2.41it/s] 34%|███▍ | 690/2030 [05:55<10:05, 2.21it/s] 34%|███▍ | 691/2030 [05:55<09:26, 2.36it/s] 34%|███▍ | 692/2030 [05:56<09:09, 2.44it/s] 34%|███▍ | 693/2030 [05:56<08:54, 2.50it/s] 34%|███▍ | 694/2030 [05:56<08:48, 2.53it/s] 34%|███▍ | 695/2030 [05:57<09:44, 2.28it/s] 34%|███▍ | 696/2030 [05:57<09:04, 2.45it/s] 34%|███▍ | 697/2030 [05:58<08:55, 2.49it/s] 34%|███▍ | 698/2030 [05:58<09:18, 2.39it/s] 34%|███▍ | 699/2030 [05:59<09:39, 2.30it/s] 34%|███▍ | 700/2030 [05:59<09:36, 2.31it/s] 35%|███▍ | 701/2030 [05:59<09:42, 2.28it/s] 35%|███▍ | 702/2030 [06:00<10:37, 2.08it/s] 35%|███▍ | 703/2030 [06:00<10:19, 2.14it/s] 35%|███▍ | 704/2030 [06:01<10:02, 2.20it/s] 35%|███▍ | 705/2030 [06:01<10:05, 2.19it/s] 35%|███▍ | 706/2030 [06:02<09:31, 2.32it/s] 35%|███▍ | 707/2030 [06:02<09:23, 2.35it/s] 35%|███▍ | 708/2030 [06:03<11:26, 1.93it/s] 35%|███▍ | 709/2030 [06:03<10:21, 2.13it/s] 35%|███▍ | 710/2030 [06:04<10:09, 2.17it/s] 35%|███▌ | 711/2030 [06:04<09:53, 2.22it/s] 35%|███▌ | 712/2030 [06:05<09:55, 2.21it/s] 35%|███▌ | 713/2030 [06:05<09:29, 2.31it/s] 35%|███▌ | 714/2030 [06:05<09:21, 2.35it/s] 35%|███▌ | 715/2030 [06:06<09:37, 2.28it/s] 35%|███▌ | 716/2030 [06:06<10:07, 2.16it/s] 35%|███▌ | 717/2030 [06:07<10:00, 2.19it/s] 35%|███▌ | 718/2030 [06:07<10:00, 2.18it/s] 35%|███▌ | 719/2030 [06:08<09:28, 2.31it/s] 35%|███▌ | 720/2030 [06:08<09:34, 2.28it/s] 36%|███▌ | 721/2030 [06:08<09:35, 2.27it/s] 36%|███▌ | 722/2030 [06:09<09:20, 2.34it/s] 36%|███▌ | 723/2030 [06:09<09:34, 2.27it/s] 36%|███▌ | 724/2030 [06:10<09:50, 2.21it/s] 36%|███▌ | 725/2030 [06:10<09:40, 2.25it/s] 36%|███▌ | 726/2030 [06:11<10:50, 2.00it/s] 36%|███▌ | 727/2030 [06:11<10:34, 2.05it/s] 36%|███▌ | 728/2030 [06:12<10:37, 2.04it/s] 36%|███▌ | 729/2030 [06:12<11:24, 1.90it/s] 36%|███▌ | 730/2030 [06:13<10:46, 2.01it/s] 
36%|███▌ | 731/2030 [06:13<09:33, 2.26it/s] 36%|███▌ | 732/2030 [06:14<09:51, 2.20it/s] 36%|███▌ | 733/2030 [06:14<09:28, 2.28it/s] 36%|███▌ | 734/2030 [06:14<09:03, 2.39it/s] 36%|███▌ | 735/2030 [06:15<09:01, 2.39it/s] 36%|███▋ | 736/2030 [06:15<09:52, 2.18it/s] 36%|███▋ | 737/2030 [06:16<09:22, 2.30it/s] 36%|███▋ | 738/2030 [06:16<09:01, 2.39it/s] 36%|███▋ | 739/2030 [06:17<10:37, 2.02it/s] 36%|███▋ | 740/2030 [06:17<09:53, 2.17it/s] 37%|███▋ | 741/2030 [06:18<10:43, 2.00it/s] 37%|███▋ | 742/2030 [06:18<10:12, 2.10it/s] 37%|███▋ | 743/2030 [06:19<09:54, 2.17it/s] 37%|███▋ | 744/2030 [06:19<11:18, 1.89it/s] 37%|███▋ | 745/2030 [06:20<10:36, 2.02it/s] 37%|███▋ | 746/2030 [06:20<10:41, 2.00it/s] 37%|███▋ | 747/2030 [06:21<09:54, 2.16it/s] 37%|███▋ | 748/2030 [06:21<10:14, 2.09it/s] 37%|███▋ | 749/2030 [06:22<10:33, 2.02it/s] 37%|███▋ | 750/2030 [06:22<12:06, 1.76it/s] 37%|███▋ | 751/2030 [06:23<13:29, 1.58it/s] 37%|███▋ | 752/2030 [06:24<11:41, 1.82it/s] 37%|███▋ | 753/2030 [06:24<10:44, 1.98it/s] 37%|███▋ | 754/2030 [06:24<10:23, 2.05it/s] 37%|███▋ | 755/2030 [06:25<10:32, 2.01it/s] 37%|███▋ | 756/2030 [06:25<09:49, 2.16it/s] 37%|███▋ | 757/2030 [06:26<10:56, 1.94it/s] 37%|███▋ | 758/2030 [06:27<12:07, 1.75it/s] 37%|███▋ | 759/2030 [06:27<11:04, 1.91it/s] 37%|███▋ | 760/2030 [06:28<10:38, 1.99it/s] 37%|███▋ | 761/2030 [06:28<10:25, 2.03it/s] 38%|███▊ | 762/2030 [06:29<10:49, 1.95it/s] 38%|███▊ | 763/2030 [06:29<10:38, 1.99it/s] 38%|███▊ | 764/2030 [06:30<13:47, 1.53it/s] 38%|███▊ | 765/2030 [06:30<12:01, 1.75it/s] 38%|███▊ | 766/2030 [06:31<13:41, 1.54it/s] 38%|███▊ | 767/2030 [06:32<12:02, 1.75it/s] 38%|███▊ | 768/2030 [06:32<11:20, 1.85it/s] 38%|███▊ | 769/2030 [06:33<10:40, 1.97it/s] 38%|███▊ | 770/2030 [06:33<11:08, 1.88it/s] 38%|███▊ | 771/2030 [06:34<12:15, 1.71it/s] 38%|███▊ | 772/2030 [06:35<13:18, 1.58it/s] 38%|███▊ | 773/2030 [06:35<12:46, 1.64it/s] 38%|███▊ | 774/2030 [06:36<11:37, 1.80it/s] 38%|███▊ | 775/2030 [06:36<10:36, 1.97it/s] 38%|███▊ | 776/2030 [06:37<10:48, 1.93it/s] 38%|███▊ | 777/2030 [06:37<10:44, 1.94it/s] 38%|███▊ | 778/2030 [06:37<09:53, 2.11it/s] 38%|███▊ | 779/2030 [06:38<09:28, 2.20it/s] 38%|███▊ | 780/2030 [06:38<09:19, 2.24it/s] 38%|███▊ | 781/2030 [06:39<09:01, 2.31it/s] 39%|███▊ | 782/2030 [06:39<08:33, 2.43it/s] 39%|███▊ | 783/2030 [06:39<08:23, 2.48it/s] 39%|███▊ | 784/2030 [06:40<08:30, 2.44it/s] 39%|███▊ | 785/2030 [06:40<08:40, 2.39it/s] 39%|███▊ | 786/2030 [06:41<09:17, 2.23it/s] 39%|███▉ | 787/2030 [06:41<09:55, 2.09it/s] 39%|███▉ | 788/2030 [06:42<09:34, 2.16it/s] 39%|███▉ | 789/2030 [06:42<08:42, 2.37it/s] 39%|███▉ | 790/2030 [06:43<11:16, 1.83it/s] 39%|███▉ | 791/2030 [06:43<10:59, 1.88it/s] 39%|███▉ | 792/2030 [06:44<10:00, 2.06it/s] 39%|███▉ | 793/2030 [06:44<09:41, 2.13it/s] 39%|███▉ | 794/2030 [06:45<09:27, 2.18it/s] 39%|███▉ | 795/2030 [06:45<08:47, 2.34it/s] 39%|███▉ | 796/2030 [06:46<10:28, 1.96it/s] 39%|███▉ | 797/2030 [06:46<09:59, 2.06it/s] 39%|███▉ | 798/2030 [06:46<09:08, 2.25it/s] 39%|███▉ | 799/2030 [06:47<09:00, 2.28it/s] 39%|███▉ | 800/2030 [06:48<10:56, 1.87it/s] 39%|███▉ | 801/2030 [06:48<09:51, 2.08it/s] 40%|███▉ | 802/2030 [06:48<08:58, 2.28it/s] 40%|███▉ | 803/2030 [06:49<08:25, 2.43it/s] 40%|███▉ | 804/2030 [06:49<08:16, 2.47it/s] 40%|███▉ | 805/2030 [06:50<08:21, 2.44it/s] 40%|███▉ | 806/2030 [06:50<08:53, 2.29it/s] 40%|███▉ | 807/2030 [06:50<08:45, 2.33it/s] 40%|███▉ | 808/2030 [06:51<08:34, 2.37it/s] 40%|███▉ | 809/2030 [06:51<08:30, 2.39it/s] 40%|███▉ | 810/2030 [06:52<08:47, 2.31it/s] 40%|███▉ | 811/2030 [06:52<08:38, 
2.35it/s] 40%|████ | 812/2030 [06:53<09:45, 2.08it/s] 40%|████ | 813/2030 [06:53<10:56, 1.85it/s] 40%|████ | 814/2030 [06:54<10:10, 1.99it/s][INFO|trainer.py:811] 2024-09-09 12:01:07,104 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-09 12:01:07,106 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-09 12:01:07,106 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-09 12:01:07,106 >> Batch size = 8 {'eval_loss': 0.1995203047990799, 'eval_precision': 0.6322393822393823, 'eval_recall': 0.7170224411603722, 'eval_f1': 0.671967171069505, 'eval_accuracy': 0.9469665372645898, 'eval_runtime': 5.8448, 'eval_samples_per_second': 430.983, 'eval_steps_per_second': 53.894, 'epoch': 3.0} 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-814 [INFO|configuration_utils.py:472] 2024-09-09 12:01:12,981 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-814/config.json [INFO|modeling_utils.py:2799] 2024-09-09 12:01:14,015 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-814/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-09 12:01:14,016 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-814/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 12:01:14,016 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-814/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-09 12:01:17,796 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 12:01:17,796 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 40%|████ | 815/2030 [07:05<1:14:56, 3.70s/it] 40%|████ | 816/2030 [07:05<55:06, 2.72s/it] 40%|████ | 817/2030 [07:06<41:38, 2.06s/it] 40%|████ | 818/2030 [07:06<31:44, 1.57s/it] 40%|████ | 819/2030 [07:07<24:35, 1.22s/it] 40%|████ | 820/2030 [07:07<19:39, 1.03it/s] 40%|████ | 821/2030 [07:08<16:18, 1.24it/s] 40%|████ | 822/2030 [07:08<14:52, 1.35it/s] 41%|████ | 823/2030 [07:09<12:30, 1.61it/s] 41%|████ | 824/2030 [07:09<11:45, 1.71it/s] 41%|████ | 825/2030 [07:10<12:53, 1.56it/s] 41%|████ | 826/2030 [07:10<11:30, 1.74it/s] 41%|████ | 827/2030 [07:11<11:01, 1.82it/s] 41%|████ | 828/2030 [07:11<10:17, 1.95it/s] 41%|████ | 829/2030 [07:12<10:03, 1.99it/s] 41%|████ | 830/2030 [07:12<10:11, 1.96it/s] 41%|████ | 831/2030 [07:13<09:23, 2.13it/s] 41%|████ | 832/2030 [07:13<09:22, 2.13it/s] 41%|████ | 833/2030 [07:14<11:02, 1.81it/s] 41%|████ | 834/2030 [07:14<09:52, 2.02it/s] 41%|████ | 835/2030 [07:15<09:28, 2.10it/s] 41%|████ | 836/2030 [07:15<09:19, 2.13it/s] 41%|████ | 837/2030 [07:15<09:14, 2.15it/s] 41%|████▏ | 838/2030 [07:16<08:57, 2.22it/s] 41%|████▏ | 839/2030 [07:17<10:58, 1.81it/s] 41%|████▏ | 840/2030 [07:17<10:08, 1.96it/s] 41%|████▏ | 841/2030 [07:17<09:19, 2.12it/s] 41%|████▏ | 842/2030 [07:18<09:02, 2.19it/s] 42%|████▏ | 843/2030 [07:18<08:32, 2.32it/s] 42%|████▏ | 844/2030 [07:19<09:20, 2.12it/s] 42%|████▏ | 845/2030 [07:19<08:56, 2.21it/s] 42%|████▏ | 846/2030 [07:20<08:16, 2.39it/s] 42%|████▏ | 847/2030 [07:20<07:49, 2.52it/s] 42%|████▏ | 848/2030 [07:20<07:54, 
2.49it/s] 42%|████▏ | 849/2030 [07:21<08:23, 2.35it/s] 42%|████▏ | 850/2030 [07:21<08:22, 2.35it/s] 42%|████▏ | 851/2030 [07:22<10:33, 1.86it/s] 42%|████▏ | 852/2030 [07:23<10:20, 1.90it/s] 42%|████▏ | 853/2030 [07:23<09:57, 1.97it/s] 42%|████▏ | 854/2030 [07:23<09:37, 2.04it/s] 42%|████▏ | 855/2030 [07:24<10:30, 1.86it/s] 42%|████▏ | 856/2030 [07:24<09:32, 2.05it/s] 42%|████▏ | 857/2030 [07:25<09:10, 2.13it/s] 42%|████▏ | 858/2030 [07:25<08:56, 2.18it/s] 42%|████▏ | 859/2030 [07:26<11:58, 1.63it/s] 42%|████▏ | 860/2030 [07:27<11:06, 1.75it/s] 42%|████▏ | 861/2030 [07:28<13:14, 1.47it/s] 42%|████▏ | 862/2030 [07:29<14:19, 1.36it/s] 43%|████▎ | 863/2030 [07:29<13:02, 1.49it/s] 43%|████▎ | 864/2030 [07:30<12:08, 1.60it/s] 43%|████▎ | 865/2030 [07:30<11:54, 1.63it/s] 43%|████▎ | 866/2030 [07:31<10:29, 1.85it/s] 43%|████▎ | 867/2030 [07:31<09:58, 1.94it/s] 43%|████▎ | 868/2030 [07:31<08:57, 2.16it/s] 43%|████▎ | 869/2030 [07:32<09:20, 2.07it/s] 43%|████▎ | 870/2030 [07:32<08:25, 2.29it/s] 43%|████▎ | 871/2030 [07:33<08:36, 2.24it/s] 43%|████▎ | 872/2030 [07:33<08:29, 2.27it/s] 43%|████▎ | 873/2030 [07:34<09:00, 2.14it/s] 43%|████▎ | 874/2030 [07:34<08:03, 2.39it/s] 43%|████▎ | 875/2030 [07:34<08:07, 2.37it/s] 43%|████▎ | 876/2030 [07:35<09:13, 2.09it/s] 43%|████▎ | 877/2030 [07:35<08:29, 2.26it/s] 43%|████▎ | 878/2030 [07:36<10:24, 1.84it/s] 43%|████▎ | 879/2030 [07:37<09:57, 1.93it/s] 43%|████▎ | 880/2030 [07:37<09:39, 1.98it/s] 43%|████▎ | 881/2030 [07:38<10:34, 1.81it/s] 43%|████▎ | 882/2030 [07:38<09:33, 2.00it/s] 43%|████▎ | 883/2030 [07:38<09:09, 2.09it/s] 44%|████▎ | 884/2030 [07:39<09:41, 1.97it/s] 44%|████▎ | 885/2030 [07:40<11:01, 1.73it/s] 44%|████▎ | 886/2030 [07:40<10:45, 1.77it/s] 44%|████▎ | 887/2030 [07:41<09:52, 1.93it/s] 44%|████▎ | 888/2030 [07:41<09:07, 2.09it/s] 44%|████▍ | 889/2030 [07:42<08:49, 2.16it/s] 44%|████▍ | 890/2030 [07:42<08:48, 2.16it/s] 44%|████▍ | 891/2030 [07:42<08:44, 2.17it/s] 44%|████▍ | 892/2030 [07:43<09:25, 2.01it/s] 44%|████▍ | 893/2030 [07:43<08:56, 2.12it/s] 44%|████▍ | 894/2030 [07:44<09:50, 1.92it/s] 44%|████▍ | 895/2030 [07:45<09:43, 1.95it/s] 44%|████▍ | 896/2030 [07:45<09:28, 2.00it/s] 44%|████▍ | 897/2030 [07:46<09:11, 2.06it/s] 44%|████▍ | 898/2030 [07:46<09:03, 2.08it/s] 44%|████▍ | 899/2030 [07:46<08:44, 2.16it/s] 44%|████▍ | 900/2030 [07:47<08:17, 2.27it/s] 44%|████▍ | 901/2030 [07:47<08:14, 2.28it/s] 44%|████▍ | 902/2030 [07:48<08:14, 2.28it/s] 44%|████▍ | 903/2030 [07:48<08:08, 2.31it/s] 45%|████▍ | 904/2030 [07:49<08:52, 2.11it/s] 45%|████▍ | 905/2030 [07:49<08:51, 2.12it/s] 45%|████▍ | 906/2030 [07:50<08:29, 2.21it/s] 45%|████▍ | 907/2030 [07:50<08:45, 2.14it/s] 45%|████▍ | 908/2030 [07:51<08:37, 2.17it/s] 45%|████▍ | 909/2030 [07:51<08:08, 2.29it/s] 45%|████▍ | 910/2030 [07:51<09:03, 2.06it/s] 45%|████▍ | 911/2030 [07:52<08:40, 2.15it/s] 45%|████▍ | 912/2030 [07:52<08:11, 2.28it/s] 45%|████▍ | 913/2030 [07:53<07:59, 2.33it/s] 45%|████▌ | 914/2030 [07:53<08:04, 2.30it/s] 45%|████▌ | 915/2030 [07:54<08:34, 2.17it/s] 45%|████▌ | 916/2030 [07:54<09:57, 1.86it/s] 45%|████▌ | 917/2030 [07:55<10:55, 1.70it/s] 45%|████▌ | 918/2030 [07:56<10:33, 1.75it/s] 45%|████▌ | 919/2030 [07:56<09:59, 1.85it/s] 45%|████▌ | 920/2030 [07:56<08:52, 2.08it/s] 45%|████▌ | 921/2030 [07:57<08:19, 2.22it/s] 45%|████▌ | 922/2030 [07:57<09:24, 1.96it/s] 45%|████▌ | 923/2030 [07:58<09:10, 2.01it/s] 46%|████▌ | 924/2030 [07:58<08:44, 2.11it/s] 46%|████▌ | 925/2030 [07:59<08:41, 2.12it/s] 46%|████▌ | 926/2030 [07:59<08:12, 2.24it/s] 46%|████▌ | 927/2030 [08:00<08:09, 
2.25it/s] 46%|████▌ | 928/2030 [08:00<09:13, 1.99it/s] 46%|████▌ | 929/2030 [08:01<09:04, 2.02it/s] 46%|████▌ | 930/2030 [08:01<08:05, 2.27it/s] 46%|████▌ | 931/2030 [08:01<07:42, 2.38it/s] 46%|████▌ | 932/2030 [08:02<07:16, 2.52it/s] 46%|████▌ | 933/2030 [08:02<07:17, 2.51it/s] 46%|████▌ | 934/2030 [08:03<08:26, 2.16it/s] 46%|████▌ | 935/2030 [08:03<08:32, 2.14it/s] 46%|████▌ | 936/2030 [08:04<08:28, 2.15it/s] 46%|████▌ | 937/2030 [08:04<08:08, 2.24it/s] 46%|████▌ | 938/2030 [08:04<07:45, 2.35it/s] 46%|████▋ | 939/2030 [08:05<09:30, 1.91it/s] 46%|████▋ | 940/2030 [08:06<08:38, 2.10it/s] 46%|████▋ | 941/2030 [08:06<08:34, 2.12it/s] 46%|████▋ | 942/2030 [08:07<08:29, 2.13it/s] 46%|████▋ | 943/2030 [08:07<07:58, 2.27it/s] 47%|████▋ | 944/2030 [08:07<07:29, 2.42it/s] 47%|████▋ | 945/2030 [08:08<08:10, 2.21it/s] 47%|████▋ | 946/2030 [08:08<07:56, 2.27it/s] 47%|████▋ | 947/2030 [08:09<07:56, 2.27it/s] 47%|████▋ | 948/2030 [08:09<07:52, 2.29it/s] 47%|████▋ | 949/2030 [08:10<07:51, 2.29it/s] 47%|████▋ | 950/2030 [08:10<08:04, 2.23it/s] 47%|████▋ | 951/2030 [08:10<07:56, 2.26it/s] 47%|████▋ | 952/2030 [08:11<08:05, 2.22it/s] 47%|████▋ | 953/2030 [08:11<08:07, 2.21it/s] 47%|████▋ | 954/2030 [08:12<07:38, 2.35it/s] 47%|████▋ | 955/2030 [08:12<07:50, 2.28it/s] 47%|████▋ | 956/2030 [08:13<08:10, 2.19it/s] 47%|████▋ | 957/2030 [08:13<08:43, 2.05it/s] 47%|████▋ | 958/2030 [08:14<08:10, 2.19it/s] 47%|████▋ | 959/2030 [08:14<08:39, 2.06it/s] 47%|████▋ | 960/2030 [08:15<08:27, 2.11it/s] 47%|████▋ | 961/2030 [08:15<09:09, 1.95it/s] 47%|████▋ | 962/2030 [08:16<08:54, 2.00it/s] 47%|████▋ | 963/2030 [08:16<08:40, 2.05it/s] 47%|████▋ | 964/2030 [08:17<08:04, 2.20it/s] 48%|████▊ | 965/2030 [08:17<08:00, 2.22it/s] 48%|████▊ | 966/2030 [08:18<08:27, 2.10it/s] 48%|████▊ | 967/2030 [08:18<08:09, 2.17it/s] 48%|████▊ | 968/2030 [08:18<08:24, 2.10it/s] 48%|████▊ | 969/2030 [08:19<08:04, 2.19it/s] 48%|████▊ | 970/2030 [08:19<07:36, 2.32it/s] 48%|████▊ | 971/2030 [08:20<09:12, 1.92it/s] 48%|████▊ | 972/2030 [08:20<08:09, 2.16it/s] 48%|████▊ | 973/2030 [08:21<08:19, 2.12it/s] 48%|████▊ | 974/2030 [08:21<07:49, 2.25it/s] 48%|████▊ | 975/2030 [08:22<09:14, 1.90it/s] 48%|████▊ | 976/2030 [08:22<08:25, 2.08it/s] 48%|████▊ | 977/2030 [08:23<07:58, 2.20it/s] 48%|████▊ | 978/2030 [08:23<08:04, 2.17it/s] 48%|████▊ | 979/2030 [08:24<08:10, 2.14it/s] 48%|████▊ | 980/2030 [08:24<07:58, 2.19it/s] 48%|████▊ | 981/2030 [08:24<07:45, 2.26it/s] 48%|████▊ | 982/2030 [08:25<07:31, 2.32it/s] 48%|████▊ | 983/2030 [08:25<07:58, 2.19it/s] 48%|████▊ | 984/2030 [08:26<08:16, 2.11it/s] 49%|████▊ | 985/2030 [08:27<09:02, 1.93it/s] 49%|████▊ | 986/2030 [08:27<08:06, 2.15it/s] 49%|████▊ | 987/2030 [08:27<07:50, 2.22it/s] 49%|████▊ | 988/2030 [08:28<07:24, 2.34it/s] 49%|████▊ | 989/2030 [08:28<07:14, 2.39it/s] 49%|████▉ | 990/2030 [08:29<07:38, 2.27it/s] 49%|████▉ | 991/2030 [08:29<09:04, 1.91it/s] 49%|████▉ | 992/2030 [08:30<08:59, 1.92it/s] 49%|████▉ | 993/2030 [08:30<08:35, 2.01it/s] 49%|████▉ | 994/2030 [08:31<08:06, 2.13it/s] 49%|████▉ | 995/2030 [08:31<07:52, 2.19it/s] 49%|████▉ | 996/2030 [08:32<08:18, 2.07it/s] 49%|████▉ | 997/2030 [08:32<08:16, 2.08it/s] 49%|████▉ | 998/2030 [08:33<08:14, 2.09it/s] 49%|████▉ | 999/2030 [08:33<08:17, 2.07it/s] 49%|████▉ | 1000/2030 [08:34<09:05, 1.89it/s] 49%|████▉ | 1000/2030 [08:34<09:05, 1.89it/s] 49%|████▉ | 1001/2030 [08:34<08:10, 2.10it/s] 49%|████▉ | 1002/2030 [08:34<08:00, 2.14it/s] 49%|████▉ | 1003/2030 [08:35<07:39, 2.24it/s] 49%|████▉ | 1004/2030 [08:35<07:57, 2.15it/s] 50%|████▉ | 1005/2030 
[08:36<07:59, 2.14it/s] 50%|████▉ | 1006/2030 [08:36<07:23, 2.31it/s] 50%|████▉ | 1007/2030 [08:37<07:28, 2.28it/s] 50%|████▉ | 1008/2030 [08:37<06:59, 2.43it/s] 50%|████▉ | 1009/2030 [08:37<07:12, 2.36it/s] 50%|████▉ | 1010/2030 [08:38<07:46, 2.19it/s] 50%|████▉ | 1011/2030 [08:38<07:43, 2.20it/s] 50%|████▉ | 1012/2030 [08:39<07:44, 2.19it/s] 50%|████▉ | 1013/2030 [08:39<07:04, 2.40it/s] 50%|████▉ | 1014/2030 [08:40<06:32, 2.59it/s] 50%|█████ | 1015/2030 [08:40<06:24, 2.64it/s] 50%|█████ | 1016/2030 [08:40<07:14, 2.34it/s] 50%|█████ | 1017/2030 [08:41<07:24, 2.28it/s][INFO|trainer.py:811] 2024-09-09 12:02:54,359 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-09 12:02:54,362 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-09 12:02:54,362 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-09 12:02:54,362 >> Batch size = 8 {'eval_loss': 0.21822449564933777, 'eval_precision': 0.6445872466633712, 'eval_recall': 0.7137383689107827, 'eval_f1': 0.6774025974025973, 'eval_accuracy': 0.9482979883858963, 'eval_runtime': 5.872, 'eval_samples_per_second': 428.988, 'eval_steps_per_second': 53.645, 'epoch': 4.0} {'loss': 0.0248, 'grad_norm': 0.7616795301437378, 'learning_rate': 2.5369458128078822e-05, 'epoch': 4.91} 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1017 [INFO|configuration_utils.py:472] 2024-09-09 12:03:00,211 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1017/config.json [INFO|modeling_utils.py:2799] 2024-09-09 12:03:01,226 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1017/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-09 12:03:01,227 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1017/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 12:03:01,227 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1017/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-09 12:03:06,669 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 12:03:06,670 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 50%|█████ | 1018/2030 [08:54<1:09:19, 4.11s/it] 50%|█████ | 1019/2030 [08:54<50:22, 2.99s/it] 50%|█████ | 1020/2030 [08:55<38:43, 2.30s/it] 50%|█████ | 1021/2030 [08:55<29:04, 1.73s/it] 50%|█████ | 1022/2030 [08:55<22:11, 1.32s/it] 50%|█████ | 1023/2030 [08:56<17:48, 1.06s/it] 50%|█████ | 1024/2030 [08:56<14:35, 1.15it/s] 50%|█████ | 1025/2030 [08:57<12:26, 1.35it/s] 51%|█████ | 1026/2030 [08:57<12:14, 1.37it/s] 51%|█████ | 1027/2030 [08:58<11:09, 1.50it/s] 51%|█████ | 1028/2030 [08:58<09:45, 1.71it/s] 51%|█████ | 1029/2030 [08:59<08:39, 1.93it/s] 51%|█████ | 1030/2030 [08:59<08:13, 2.03it/s] 51%|█████ | 1031/2030 [09:00<07:36, 2.19it/s] 51%|█████ | 1032/2030 [09:00<07:42, 2.16it/s] 51%|█████ | 1033/2030 [09:00<07:22, 2.25it/s] 51%|█████ | 1034/2030 [09:01<07:16, 2.28it/s] 51%|█████ | 1035/2030 [09:01<06:45, 2.45it/s] 51%|█████ | 1036/2030 [09:02<07:09, 2.32it/s] 51%|█████ | 1037/2030 [09:02<07:01, 2.35it/s] 51%|█████ | 1038/2030 
 50%|█████ | 1018/2030 [08:54<1:09:19, 4.11s/it] … 60%|██████ | 1221/2030 [10:29<06:46, 1.99it/s]
[INFO|trainer.py:811] 2024-09-09 12:04:42,494 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-09 12:04:42,496 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-09 12:04:42,496 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-09 12:04:42,496 >> Batch size = 8
{'eval_loss': 0.24612903594970703, 'eval_precision': 0.6251184834123222, 'eval_recall': 0.7219485495347564, 'eval_f1': 0.6700533401066802, 'eval_accuracy': 0.9448650903140942, 'eval_runtime': 5.8462, 'eval_samples_per_second': 430.877, 'eval_steps_per_second': 53.881, 'epoch': 5.0}
>> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1221
[INFO|configuration_utils.py:472] 2024-09-09 12:04:48,406 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1221/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:04:49,434 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1221/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:04:49,435 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1221/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:04:49,436 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1221/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:04:52,502 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:04:52,502 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
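The evaluate-then-checkpoint rhythm visible here (one evaluation and one `checkpoint-<step>` directory per epoch, plus the later reload of the best checkpoint) follows from the arguments echoed at the start of the log. A condensed sketch showing only the fields that drive this behaviour; everything else is left at its default:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="/content/dissertation/scripts/ner/output",
    num_train_epochs=10,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=2,   # yields the 2030 optimizer steps seen in the progress bars
    per_device_eval_batch_size=8,    # matches "Batch size = 8" in the evaluation blocks
    eval_strategy="epoch",           # evaluate at every epoch boundary
    save_strategy="epoch",           # write checkpoint-<step> at every epoch boundary
    load_best_model_at_end=True,     # reload the best checkpoint after training
    metric_for_best_model="f1",
    greater_is_better=True,
    push_to_hub=True,
)
```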
 60%|██████ | 1222/2030 [10:40<46:42, 3.47s/it] … 70%|███████ | 1424/2030 [12:15<04:43, 2.13it/s]
[INFO|trainer.py:811] 2024-09-09 12:06:27,952 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-09 12:06:27,954 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-09 12:06:27,954 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-09 12:06:27,954 >> Batch size = 8
{'eval_loss': 0.26953065395355225, 'eval_precision': 0.6410379625180201, 'eval_recall': 0.7301587301587301, 'eval_f1': 0.6827021494370521, 'eval_accuracy': 0.9469023709454907, 'eval_runtime': 5.9067, 'eval_samples_per_second': 426.468, 'eval_steps_per_second': 53.33, 'epoch': 6.0}
>> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1424
[INFO|configuration_utils.py:472] 2024-09-09 12:06:33,814 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1424/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:06:34,834 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1424/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:06:34,835 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1424/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:06:34,835 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1424/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:06:39,873 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:06:39,873 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
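Each `checkpoint-<step>` directory written above contains `config.json`, `model.safetensors` and the tokenizer files, so it can be reloaded directly. A minimal sketch using one of the checkpoints named in this log; the path assumes the same Colab filesystem:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

ckpt = "/content/dissertation/scripts/ner/output/checkpoint-1424"

model = AutoModelForTokenClassification.from_pretrained(ckpt)
tokenizer = AutoTokenizer.from_pretrained(ckpt)

# The label mapping travels with the checkpoint's config.json.
print(model.config.id2label)  # {0: 'O', 1: 'B-SINTOMA', 2: 'I-SINTOMA'}
```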
 70%|███████ | 1425/2030 [12:27<40:50, 4.05s/it] … 80%|████████ | 1628/2030 [14:03<03:49, 1.75it/s]
[INFO|trainer.py:811] 2024-09-09 12:08:15,914 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-09 12:08:15,916 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-09 12:08:15,916 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-09 12:08:15,916 >> Batch size = 8
{'eval_loss': 0.2829184830188751, 'eval_precision': 0.6528724440116845, 'eval_recall': 0.7339901477832512, 'eval_f1': 0.6910590054109765, 'eval_accuracy': 0.9469986204241394, 'eval_runtime': 5.8572, 'eval_samples_per_second': 430.069, 'eval_steps_per_second': 53.78, 'epoch': 7.0}
{'loss': 0.0081, 'grad_norm': 0.2855200171470642, 'learning_rate': 1.3054187192118228e-05, 'epoch': 7.37}
>> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1628
[INFO|configuration_utils.py:472] 2024-09-09 12:08:21,812 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1628/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:08:22,832 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1628/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:08:22,833 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1628/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:08:22,833 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1628/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:08:25,863 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:08:25,864 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
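The repeated "The following columns ... have been ignored" notice before each evaluation is informational: with `remove_unused_columns=True` (the default), the `Trainer` drops dataset columns that the model's `forward` cannot accept before batching. A rough sketch of the idea behind that check, illustrative only and not the Trainer's actual code path:

```python
import inspect

from transformers import RobertaForTokenClassification

# Columns present in a tokenized token-classification dataset before collation
# (assumed names; the raw columns tokens/ner_tags/id are the ones named in the log).
dataset_columns = {"id", "tokens", "ner_tags", "input_ids", "attention_mask", "labels"}

# Keep only the columns that forward() can accept; the rest are ignored.
accepted = set(inspect.signature(RobertaForTokenClassification.forward).parameters)
ignored = dataset_columns - accepted

print(sorted(ignored))  # ['id', 'ner_tags', 'tokens'], exactly the columns named above
```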
 80%|████████ | 1629/2030 [14:13<23:29, 3.51s/it] … 90%|█████████ | 1831/2030 [15:48<01:35, 2.08it/s]
[INFO|trainer.py:811] 2024-09-09 12:10:01,062 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-09 12:10:01,064 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-09 12:10:01,064 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-09 12:10:01,064 >> Batch size = 8
{'eval_loss': 0.29823970794677734, 'eval_precision': 0.6710997442455243, 'eval_recall': 0.7181171319102354, 'eval_f1': 0.6938127974616606, 'eval_accuracy': 0.9494048573903558, 'eval_runtime': 5.8929, 'eval_samples_per_second': 427.463, 'eval_steps_per_second': 53.454, 'epoch': 8.0}
>> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1831
[INFO|configuration_utils.py:472] 2024-09-09 12:10:06,933 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1831/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:10:07,950 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1831/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:10:07,951 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1831/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:10:07,952 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1831/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:10:10,965 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:10:10,966 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
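As a quick sanity check on the numbers (not part of the training script), the reported `eval_f1` is simply the harmonic mean of the reported precision and recall:

```python
# Values copied from the epoch 8.0 evaluation dict above.
p = 0.6710997442455243
r = 0.7181171319102354

f1 = 2 * p * r / (p + r)
print(round(f1, 6))  # ≈ 0.693813, matching eval_f1 = 0.6938127974616606
```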
91%|█████████ | 1841/2030 [16:03<02:19, 1.35it/s] 91%|█████████ | 1842/2030 [16:03<02:03, 1.52it/s] 91%|█████████ | 1843/2030 [16:04<01:51, 1.68it/s] 91%|█████████ | 1844/2030 [16:04<01:41, 1.83it/s] 91%|█████████ | 1845/2030 [16:04<01:29, 2.06it/s] 91%|█████████ | 1846/2030 [16:05<01:23, 2.20it/s] 91%|█████████ | 1847/2030 [16:05<01:24, 2.17it/s] 91%|█████████ | 1848/2030 [16:06<01:28, 2.06it/s] 91%|█████████ | 1849/2030 [16:07<01:57, 1.54it/s] 91%|█████████ | 1850/2030 [16:07<01:49, 1.65it/s] 91%|█████████ | 1851/2030 [16:08<01:42, 1.74it/s] 91%|█████████ | 1852/2030 [16:08<01:33, 1.90it/s] 91%|█████████▏| 1853/2030 [16:09<01:34, 1.87it/s] 91%|█████████▏| 1854/2030 [16:09<01:26, 2.03it/s] 91%|█████████▏| 1855/2030 [16:10<01:24, 2.08it/s] 91%|█████████▏| 1856/2030 [16:10<01:17, 2.23it/s] 91%|█████████▏| 1857/2030 [16:10<01:13, 2.37it/s] 92%|█████████▏| 1858/2030 [16:11<01:14, 2.32it/s] 92%|█████████▏| 1859/2030 [16:11<01:11, 2.40it/s] 92%|█████████▏| 1860/2030 [16:12<01:10, 2.39it/s] 92%|█████████▏| 1861/2030 [16:12<01:07, 2.51it/s] 92%|█████████▏| 1862/2030 [16:13<01:11, 2.34it/s] 92%|█████████▏| 1863/2030 [16:13<01:10, 2.36it/s] 92%|█████████▏| 1864/2030 [16:13<01:08, 2.43it/s] 92%|█████████▏| 1865/2030 [16:14<01:11, 2.30it/s] 92%|█████████▏| 1866/2030 [16:14<01:10, 2.33it/s] 92%|█████████▏| 1867/2030 [16:15<01:07, 2.40it/s] 92%|█████████▏| 1868/2030 [16:15<01:06, 2.45it/s] 92%|█████████▏| 1869/2030 [16:16<01:14, 2.15it/s] 92%|█████████▏| 1870/2030 [16:16<01:11, 2.25it/s] 92%|█████████▏| 1871/2030 [16:16<01:07, 2.34it/s] 92%|█████████▏| 1872/2030 [16:17<01:09, 2.27it/s] 92%|█████████▏| 1873/2030 [16:17<01:11, 2.20it/s] 92%|█████████▏| 1874/2030 [16:18<01:06, 2.36it/s] 92%|█████████▏| 1875/2030 [16:18<01:08, 2.26it/s] 92%|█████████▏| 1876/2030 [16:19<01:11, 2.16it/s] 92%|█████████▏| 1877/2030 [16:19<01:19, 1.93it/s] 93%|█████████▎| 1878/2030 [16:20<01:16, 1.99it/s] 93%|█████████▎| 1879/2030 [16:20<01:14, 2.01it/s] 93%|█████████▎| 1880/2030 [16:21<01:21, 1.84it/s] 93%|█████████▎| 1881/2030 [16:21<01:14, 2.01it/s] 93%|█████████▎| 1882/2030 [16:22<01:08, 2.15it/s] 93%|█████████▎| 1883/2030 [16:22<01:05, 2.24it/s] 93%|█████████▎| 1884/2030 [16:23<01:03, 2.29it/s] 93%|█████████▎| 1885/2030 [16:23<01:05, 2.23it/s] 93%|█████████▎| 1886/2030 [16:23<01:03, 2.26it/s] 93%|█████████▎| 1887/2030 [16:24<01:00, 2.37it/s] 93%|█████████▎| 1888/2030 [16:24<00:59, 2.37it/s] 93%|█████████▎| 1889/2030 [16:25<00:58, 2.42it/s] 93%|█████████▎| 1890/2030 [16:25<01:01, 2.29it/s] 93%|█████████▎| 1891/2030 [16:26<01:02, 2.23it/s] 93%|█████████▎| 1892/2030 [16:26<01:00, 2.30it/s] 93%|█████████▎| 1893/2030 [16:26<00:59, 2.32it/s] 93%|█████████▎| 1894/2030 [16:27<00:57, 2.38it/s] 93%|█████████▎| 1895/2030 [16:27<00:56, 2.38it/s] 93%|█████████▎| 1896/2030 [16:28<00:58, 2.29it/s] 93%|█████████▎| 1897/2030 [16:28<00:53, 2.49it/s] 93%|█████████▎| 1898/2030 [16:29<00:57, 2.30it/s] 94%|█████████▎| 1899/2030 [16:29<01:11, 1.84it/s] 94%|█████████▎| 1900/2030 [16:30<01:05, 1.98it/s] 94%|█████████▎| 1901/2030 [16:30<00:59, 2.18it/s] 94%|█████████▎| 1902/2030 [16:31<01:02, 2.06it/s] 94%|█████████▎| 1903/2030 [16:31<01:01, 2.06it/s] 94%|█████████▍| 1904/2030 [16:32<01:00, 2.09it/s] 94%|█████████▍| 1905/2030 [16:32<00:55, 2.24it/s] 94%|█████████▍| 1906/2030 [16:33<01:00, 2.06it/s] 94%|█████████▍| 1907/2030 [16:33<00:55, 2.21it/s] 94%|█████████▍| 1908/2030 [16:34<01:01, 1.97it/s] 94%|█████████▍| 1909/2030 [16:34<00:57, 2.12it/s] 94%|█████████▍| 1910/2030 [16:34<00:52, 2.27it/s] 94%|█████████▍| 1911/2030 [16:35<00:54, 2.17it/s] 
94%|█████████▍| 1912/2030 [16:35<00:54, 2.18it/s] 94%|█████████▍| 1913/2030 [16:36<00:54, 2.13it/s] 94%|█████████▍| 1914/2030 [16:36<01:00, 1.92it/s] 94%|█████████▍| 1915/2030 [16:37<01:01, 1.88it/s] 94%|█████████▍| 1916/2030 [16:37<00:55, 2.05it/s] 94%|█████████▍| 1917/2030 [16:38<00:55, 2.05it/s] 94%|█████████▍| 1918/2030 [16:38<00:51, 2.17it/s] 95%|█████████▍| 1919/2030 [16:39<00:49, 2.24it/s] 95%|█████████▍| 1920/2030 [16:39<00:51, 2.14it/s] 95%|█████████▍| 1921/2030 [16:40<00:50, 2.14it/s] 95%|█████████▍| 1922/2030 [16:40<00:47, 2.27it/s] 95%|█████████▍| 1923/2030 [16:40<00:46, 2.30it/s] 95%|█████████▍| 1924/2030 [16:41<00:47, 2.23it/s] 95%|█████████▍| 1925/2030 [16:41<00:44, 2.35it/s] 95%|█████████▍| 1926/2030 [16:42<00:41, 2.48it/s] 95%|█████████▍| 1927/2030 [16:42<00:41, 2.50it/s] 95%|█████████▍| 1928/2030 [16:43<00:46, 2.21it/s] 95%|█████████▌| 1929/2030 [16:43<00:45, 2.22it/s] 95%|█████████▌| 1930/2030 [16:43<00:42, 2.33it/s] 95%|█████████▌| 1931/2030 [16:44<00:44, 2.20it/s] 95%|█████████▌| 1932/2030 [16:44<00:42, 2.29it/s] 95%|█████████▌| 1933/2030 [16:45<00:42, 2.27it/s] 95%|█████████▌| 1934/2030 [16:45<00:50, 1.91it/s] 95%|█████████▌| 1935/2030 [16:46<00:48, 1.96it/s] 95%|█████████▌| 1936/2030 [16:46<00:45, 2.07it/s] 95%|█████████▌| 1937/2030 [16:47<00:42, 2.17it/s] 95%|█████████▌| 1938/2030 [16:47<00:44, 2.05it/s] 96%|█████████▌| 1939/2030 [16:48<00:41, 2.19it/s] 96%|█████████▌| 1940/2030 [16:48<00:37, 2.41it/s] 96%|█████████▌| 1941/2030 [16:49<00:38, 2.30it/s] 96%|█████████▌| 1942/2030 [16:49<00:38, 2.28it/s] 96%|█████████▌| 1943/2030 [16:49<00:36, 2.39it/s] 96%|█████████▌| 1944/2030 [16:50<00:46, 1.86it/s] 96%|█████████▌| 1945/2030 [16:51<00:42, 1.98it/s] 96%|█████████▌| 1946/2030 [16:51<00:38, 2.19it/s] 96%|█████████▌| 1947/2030 [16:51<00:38, 2.17it/s] 96%|█████████▌| 1948/2030 [16:52<00:35, 2.31it/s] 96%|█████████▌| 1949/2030 [16:52<00:31, 2.53it/s] 96%|█████████▌| 1950/2030 [16:53<00:35, 2.26it/s] 96%|█████████▌| 1951/2030 [16:53<00:34, 2.32it/s] 96%|█████████▌| 1952/2030 [16:54<00:38, 2.04it/s] 96%|█████████▌| 1953/2030 [16:54<00:37, 2.05it/s] 96%|█████████▋| 1954/2030 [16:55<00:35, 2.17it/s] 96%|█████████▋| 1955/2030 [16:55<00:31, 2.39it/s] 96%|█████████▋| 1956/2030 [16:56<00:36, 2.05it/s] 96%|█████████▋| 1957/2030 [16:56<00:34, 2.12it/s] 96%|█████████▋| 1958/2030 [16:56<00:31, 2.29it/s] 97%|█████████▋| 1959/2030 [16:57<00:31, 2.26it/s] 97%|█████████▋| 1960/2030 [16:57<00:29, 2.38it/s] 97%|█████████▋| 1961/2030 [16:58<00:28, 2.43it/s] 97%|█████████▋| 1962/2030 [16:58<00:27, 2.48it/s] 97%|█████████▋| 1963/2030 [16:58<00:27, 2.43it/s] 97%|█████████▋| 1964/2030 [16:59<00:30, 2.20it/s] 97%|█████████▋| 1965/2030 [17:00<00:36, 1.79it/s] 97%|█████████▋| 1966/2030 [17:00<00:34, 1.83it/s] 97%|█████████▋| 1967/2030 [17:01<00:33, 1.90it/s] 97%|█████████▋| 1968/2030 [17:01<00:33, 1.82it/s] 97%|█████████▋| 1969/2030 [17:02<00:31, 1.92it/s] 97%|█████████▋| 1970/2030 [17:02<00:30, 1.94it/s] 97%|█████████▋| 1971/2030 [17:03<00:28, 2.05it/s] 97%|█████████▋| 1972/2030 [17:03<00:28, 2.05it/s] 97%|█████████▋| 1973/2030 [17:04<00:27, 2.07it/s] 97%|█████████▋| 1974/2030 [17:04<00:26, 2.12it/s] 97%|█████████▋| 1975/2030 [17:05<00:25, 2.13it/s] 97%|█████████▋| 1976/2030 [17:05<00:23, 2.28it/s] 97%|█████████▋| 1977/2030 [17:05<00:21, 2.42it/s] 97%|█████████▋| 1978/2030 [17:06<00:21, 2.47it/s] 97%|█████████▋| 1979/2030 [17:06<00:21, 2.40it/s] 98%|█████████▊| 1980/2030 [17:07<00:20, 2.41it/s] 98%|█████████▊| 1981/2030 [17:07<00:22, 2.17it/s] 98%|█████████▊| 1982/2030 [17:08<00:22, 2.16it/s] 
98%|█████████▊| 1983/2030 [17:08<00:23, 1.98it/s] 98%|█████████▊| 1984/2030 [17:09<00:21, 2.14it/s] 98%|█████████▊| 1985/2030 [17:09<00:25, 1.80it/s] 98%|█████████▊| 1986/2030 [17:10<00:26, 1.66it/s] 98%|█████████▊| 1987/2030 [17:10<00:23, 1.85it/s] 98%|█████████▊| 1988/2030 [17:11<00:22, 1.86it/s] 98%|█████████▊| 1989/2030 [17:12<00:23, 1.72it/s] 98%|█████████▊| 1990/2030 [17:12<00:21, 1.90it/s] 98%|█████████▊| 1991/2030 [17:12<00:19, 2.02it/s] 98%|█████████▊| 1992/2030 [17:13<00:18, 2.02it/s] 98%|█████████▊| 1993/2030 [17:13<00:16, 2.19it/s] 98%|█████████▊| 1994/2030 [17:14<00:16, 2.15it/s] 98%|█████████▊| 1995/2030 [17:14<00:15, 2.28it/s] 98%|█████████▊| 1996/2030 [17:15<00:14, 2.30it/s] 98%|█████████▊| 1997/2030 [17:15<00:15, 2.18it/s] 98%|█████████▊| 1998/2030 [17:15<00:13, 2.35it/s] 98%|█████████▊| 1999/2030 [17:16<00:13, 2.23it/s] 99%|█████████▊| 2000/2030 [17:17<00:14, 2.02it/s] 99%|█████████▊| 2000/2030 [17:17<00:14, 2.02it/s] 99%|█████████▊| 2001/2030 [17:17<00:13, 2.08it/s] 99%|█████████▊| 2002/2030 [17:17<00:13, 2.11it/s] 99%|█████████▊| 2003/2030 [17:18<00:12, 2.22it/s] 99%|█████████▊| 2004/2030 [17:18<00:11, 2.22it/s] 99%|█████████▉| 2005/2030 [17:19<00:12, 2.03it/s] 99%|█████████▉| 2006/2030 [17:20<00:12, 1.86it/s] 99%|█████████▉| 2007/2030 [17:20<00:12, 1.85it/s] 99%|█████████▉| 2008/2030 [17:21<00:11, 1.98it/s] 99%|█████████▉| 2009/2030 [17:21<00:09, 2.10it/s] 99%|█████████▉| 2010/2030 [17:21<00:08, 2.25it/s] 99%|█████████▉| 2011/2030 [17:22<00:08, 2.12it/s] 99%|█████████▉| 2012/2030 [17:22<00:08, 2.05it/s] 99%|█████████▉| 2013/2030 [17:23<00:07, 2.15it/s] 99%|█████████▉| 2014/2030 [17:23<00:07, 2.22it/s] 99%|█████████▉| 2015/2030 [17:24<00:07, 2.01it/s] 99%|█████████▉| 2016/2030 [17:24<00:06, 2.12it/s] 99%|█████████▉| 2017/2030 [17:25<00:07, 1.71it/s] 99%|█████████▉| 2018/2030 [17:25<00:06, 1.96it/s] 99%|█████████▉| 2019/2030 [17:26<00:05, 1.99it/s] 100%|█████████▉| 2020/2030 [17:26<00:04, 2.08it/s] 100%|█████████▉| 2021/2030 [17:27<00:03, 2.27it/s] 100%|█████████▉| 2022/2030 [17:27<00:03, 2.32it/s] 100%|█████████▉| 2023/2030 [17:28<00:03, 2.28it/s] 100%|█████████▉| 2024/2030 [17:28<00:03, 1.98it/s] 100%|█████████▉| 2025/2030 [17:29<00:02, 1.92it/s] 100%|█████████▉| 2026/2030 [17:29<00:02, 1.71it/s] 100%|█████████▉| 2027/2030 [17:30<00:01, 1.91it/s] 100%|█████████▉| 2028/2030 [17:30<00:01, 1.84it/s] 100%|█████████▉| 2029/2030 [17:31<00:00, 1.75it/s] 100%|██████████| 2030/2030 [17:32<00:00, 1.75it/s][INFO|trainer.py:3503] 2024-09-09 12:11:44,921 >> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-2030 [INFO|configuration_utils.py:472] 2024-09-09 12:11:44,922 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-2030/config.json [INFO|modeling_utils.py:2799] 2024-09-09 12:11:45,981 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-2030/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-09 12:11:45,982 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-2030/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 12:11:45,982 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-2030/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-09 12:11:48,995 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-09 12:11:48,995 >> Special tokens file saved in 
[INFO|trainer.py:811] 2024-09-09 12:11:49,046 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-09 12:11:49,048 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-09 12:11:49,048 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-09 12:11:49,048 >> Batch size = 8
{'eval_loss': 0.30729904770851135, 'eval_precision': 0.6764102564102564, 'eval_recall': 0.7219485495347564, 'eval_f1': 0.6984379136881121, 'eval_accuracy': 0.9500465205813469, 'eval_runtime': 5.8665, 'eval_samples_per_second': 429.386, 'eval_steps_per_second': 53.695, 'epoch': 9.0}
{'loss': 0.0038, 'grad_norm': 0.6682894825935364, 'learning_rate': 7.389162561576355e-07, 'epoch': 9.83}
  0%|          | 0/315 [00:00<?, ?it/s]
[INFO|trainer.py:3503] >> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-2030
[INFO|configuration_utils.py:472] 2024-09-09 12:11:54,954 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-2030/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:11:56,378 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-2030/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:11:56,381 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-2030/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:11:56,381 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-2030/special_tokens_map.json
[INFO|trainer.py:2394] 2024-09-09 12:11:58,332 >> Training completed. Do not forget to share your model on huggingface.co/models =)
[INFO|trainer.py:2632] 2024-09-09 12:11:58,332 >> Loading best model from /content/dissertation/scripts/ner/output/checkpoint-1831 (score: 0.6984379136881121).
100%|██████████| 2030/2030 [17:45<00:00, 1.75it/s]
100%|██████████| 2030/2030 [17:45<00:00, 1.90it/s]
[INFO|trainer.py:4283] 2024-09-09 12:11:58,535 >> Waiting for the current checkpoint push to be finished, this might take a couple of minutes.
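The "Loading best model from .../checkpoint-1831 (score: 0.6984...)" line is the effect of per-epoch evaluation combined with best-model tracking on F1: the final checkpoint (step 2030) is still written, but the weights carried forward are the ones with the highest eval_f1. A hedged sketch of the argument combination that produces this behaviour (illustrative values, not a copy of the actual invocation):

# Sketch only: the TrainingArguments combination that makes the Trainer
# reload the best-F1 checkpoint at the end of training, as seen in the log above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="/content/dissertation/scripts/ner/output",
    eval_strategy="epoch",         # evaluate at every epoch boundary
    save_strategy="epoch",         # must match eval_strategy for best-model tracking
    load_best_model_at_end=True,   # after the last step, restore the winning checkpoint
    metric_for_best_model="f1",    # compare checkpoints on eval_f1
    greater_is_better=True,        # higher F1 wins
)
# In this run the best eval_f1 (0.6984...) was reached at step 1831, so that
# checkpoint is restored instead of the final checkpoint-2030.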
[INFO|trainer.py:3503] 2024-09-09 12:12:20,226 >> Saving model checkpoint to /content/dissertation/scripts/ner/output
[INFO|configuration_utils.py:472] 2024-09-09 12:12:20,228 >> Configuration saved in /content/dissertation/scripts/ner/output/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:12:21,587 >> Model weights saved in /content/dissertation/scripts/ner/output/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:12:21,588 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:12:21,588 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
[INFO|trainer.py:3503] 2024-09-09 12:12:21,637 >> Saving model checkpoint to /content/dissertation/scripts/ner/output
[INFO|configuration_utils.py:472] 2024-09-09 12:12:21,638 >> Configuration saved in /content/dissertation/scripts/ner/output/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:12:22,908 >> Model weights saved in /content/dissertation/scripts/ner/output/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:12:22,909 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:12:22,909 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
{'eval_loss': 0.3079104423522949, 'eval_precision': 0.6712820512820513, 'eval_recall': 0.7164750957854407, 'eval_f1': 0.6931427058512046, 'eval_accuracy': 0.9500465205813469, 'eval_runtime': 5.9033, 'eval_samples_per_second': 426.708, 'eval_steps_per_second': 53.36, 'epoch': 9.98}
{'train_runtime': 1065.756, 'train_samples_per_second': 122.101, 'train_steps_per_second': 1.905, 'train_loss': 0.04138289297302368, 'epoch': 9.98}
events.out.tfevents.1725882852.0a1c9bec2a53.9893.0:   0%|          | 0.00/11.1k [00:00<?, ?B/s]
[INFO|trainer.py:811] >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-09 12:12:29,076 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-09 12:12:29,076 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-09 12:12:29,076 >> Batch size = 8
  0%|          | 0/315 [00:00<?, ?it/s]
[INFO|trainer.py:811] >> The following columns in the test set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: tokens, ner_tags, id. If tokens, ner_tags, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
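The eval_precision / eval_recall / eval_f1 values above are entity-level scores, while eval_accuracy is token-level. A common way to produce exactly this set of keys is the seqeval metric applied to label sequences with the -100 padding positions filtered out; the sketch below follows that assumption (label names taken from the model config; the actual compute_metrics of the run script is not shown in this log):

# Sketch, in the style of the Hugging Face token-classification example:
# turn (logits, labels) into seqeval precision/recall/F1 plus token accuracy.
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")
label_list = ["O", "B-SINTOMA", "I-SINTOMA"]  # id2label from the model config

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=2)

    # Drop positions labelled -100 (special tokens and non-first sub-words).
    true_predictions = [
        [label_list[p] for (p, l) in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for (p, l) in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]

    results = seqeval.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }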
[INFO|trainer.py:3819] 2024-09-09 12:12:35,172 >> ***** Running Prediction *****
[INFO|trainer.py:3821] 2024-09-09 12:12:35,172 >> Num examples = 4047
[INFO|trainer.py:3824] 2024-09-09 12:12:35,172 >> Batch size = 8
  0%|          | 0/506 [00:00<?, ?it/s]
[INFO|trainer.py:3503] >> Saving model checkpoint to /content/dissertation/scripts/ner/output
[INFO|configuration_utils.py:472] 2024-09-09 12:12:45,084 >> Configuration saved in /content/dissertation/scripts/ner/output/config.json
[INFO|modeling_utils.py:2799] 2024-09-09 12:12:46,408 >> Model weights saved in /content/dissertation/scripts/ner/output/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-09 12:12:46,409 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-09 12:12:46,410 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
***** predict metrics *****
  predict_accuracy           =     0.9467
  predict_f1                 =     0.6952
  predict_loss               =     0.3348
  predict_precision          =     0.6863
  predict_recall             =     0.7042
  predict_runtime            = 0:00:09.74
  predict_samples_per_second =    415.118
  predict_steps_per_second   =     51.903
events.out.tfevents.1725883955.0a1c9bec2a53.9893.1:   0%|          | 0.00/560 [00:00<?, ?B/s]
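The prediction pass above scores 4047 test examples and reports the predict_* metrics; turning the raw output back into tag sequences uses the same -100 filtering as evaluation. A sketch with assumed names (`trainer`, `test_dataset`, and `label_list` are not visible in this excerpt):

# Sketch with assumed names: run the test-set prediction and recover
# per-token tag sequences from the logits, as in the step logged above.
import numpy as np

predictions, label_ids, metrics = trainer.predict(test_dataset, metric_key_prefix="predict")
pred_ids = np.argmax(predictions, axis=2)

pred_tags = [
    [label_list[p] for (p, l) in zip(pred_row, lab_row) if l != -100]
    for pred_row, lab_row in zip(pred_ids, label_ids)
]

print(metrics["predict_f1"], metrics["predict_precision"], metrics["predict_recall"])
print(pred_tags[0][:10])  # first ten predicted tags of the first test sentence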