Using the `WANDB_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
02/05/2024 22:54:59 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: False
02/05/2024 22:54:59 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_persistent_workers=False,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=False,
do_predict=True,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=2,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=None,
group_by_length=True,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_num_input_tokens_seen=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=input_length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=/beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407/eval/cascaded_SLU/runs/Feb05_22-54-59_chasma-02,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=steps,
lr_scheduler_kwargs={},
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_torch,
optim_args=None,
output_dir=/beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407/eval/cascaded_SLU,
overwrite_output_dir=False,
past_index=-1,
per_device_eval_batch_size=32,
per_device_train_batch_size=8,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=/beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407/eval/cascaded_SLU,
save_on_each_node=False,
save_only_model=False,
save_safetensors=True,
save_steps=500,
save_strategy=steps,
save_total_limit=None,
seed=42,
skip_memory_metrics=True,
sortish_sampler=False,
split_batches=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
02/05/2024 22:54:59 - INFO - datasets.info - Loading Dataset Infos from /beegfs/scratch/user/blee/hugging-face/models/modules/datasets_modules/datasets/speech_massive_cascaded/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293
02/05/2024 22:54:59 - INFO - datasets.builder - Overwrite dataset info from restored data version if exists.
02/05/2024 22:54:59 - INFO - datasets.info - Loading Dataset info from /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293
02/05/2024 22:54:59 - INFO - datasets.builder - Found cached dataset speech_massive_cascaded (/beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293)
02/05/2024 22:54:59 - INFO - datasets.info - Loading Dataset info from /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293
[INFO|configuration_utils.py:737] 2024-02-05 22:54:59,933 >> loading configuration file /beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407/config.json
[INFO|configuration_utils.py:802] 2024-02-05 22:54:59,942 >> Model config MT5Config {
  "_name_or_path": "/beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407",
  "architectures": [
    "MT5ForConditionalGeneration"
  ],
  "classifier_dropout": 0.0,
  "d_ff": 2048,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dense_act_fn": "gelu_new",
  "dropout": 0.2,
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "gated-gelu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": true,
  "layer_norm_epsilon": 1e-06,
  "model_type": "mt5",
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "tie_word_embeddings": false,
  "tokenizer_class": "T5Tokenizer",
  "torch_dtype": "float32",
  "transformers_version": "4.37.0.dev0",
  "use_cache": true,
  "vocab_size": 250112
}

[INFO|tokenization_utils_base.py:2024] 2024-02-05 22:54:59,944 >> loading file spiece.model
[INFO|tokenization_utils_base.py:2024] 2024-02-05 22:54:59,945 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2024] 2024-02-05 22:54:59,945 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2024] 2024-02-05 22:54:59,945 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2024] 2024-02-05 22:54:59,945 >> loading file tokenizer_config.json
[INFO|modeling_utils.py:3373] 2024-02-05 22:55:00,407 >> loading weights file /beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407/model.safetensors
[INFO|configuration_utils.py:826] 2024-02-05 22:55:00,566 >> Generate config GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0
}

[INFO|modeling_utils.py:4224] 2024-02-05 22:55:05,619 >> All model checkpoint weights were used when initializing MT5ForConditionalGeneration.

[INFO|modeling_utils.py:4232] 2024-02-05 22:55:05,620 >> All the weights of MT5ForConditionalGeneration were initialized from the model checkpoint at /beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407.
If your task is similar to the task the model of the checkpoint was trained on, you can already use MT5ForConditionalGeneration for predictions without further training.
[INFO|configuration_utils.py:779] 2024-02-05 22:55:05,627 >> loading configuration file /beegfs/scratch/user/blee/project_3/models/NLU.mt5-base.task_type-1.fine_tune.gpu_a100-40g+.node-1x1.bsz-64.epochs-22.metric-ema.metric_lang-all/checkpoint-30407/generation_config.json
[INFO|configuration_utils.py:826] 2024-02-05 22:55:05,628 >> Generate config GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0
}

02/05/2024 22:55:05 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-a28263cfb71413f6.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 9509.58 examples/s]
02/05/2024 22:55:06 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-0f6b9ba1cc4e5fb1.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 18274.25 examples/s]
02/05/2024 22:55:06 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-cd10216b4341f1a2.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 17438.54 examples/s]
02/05/2024 22:55:06 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-cf09cb968cef4c56.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 16973.43 examples/s]
02/05/2024 22:55:06 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-b372c4d6e9ad447f.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 9564.04 examples/s]
02/05/2024 22:55:07 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-5e85733ab0d7983c.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 20206.18 examples/s]
02/05/2024 22:55:07 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-3498127f38d0e88c.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 17062.70 examples/s]
02/05/2024 22:55:07 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-5031b1ae09c119f0.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 17891.90 examples/s]
02/05/2024 22:55:07 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-afad46b8cc76fbde.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 16795.72 examples/s]
02/05/2024 22:55:08 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-44bbdcb1f95b0505.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 9148.27 examples/s]
02/05/2024 22:55:08 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-649defd19aa00c44.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 19383.07 examples/s]
02/05/2024 22:55:08 - INFO - datasets.arrow_dataset - Caching processed dataset at /beegfs/scratch/user/blee/hugging-face/models/datasets/speech_massive_cascaded/multilingual-test/1.0.0/f36c9e4210ec02a91ee05c9fa785d90aec211ba2025363c65b643c68e109b293/cache-bc58b34f0f2e6d55.arrow
Running tokenizer on prediction dataset: 100%|██████████| 2974/2974 [00:00<00:00, 16740.65 examples/s]
02/05/2024 22:55:21 - WARNING - accelerate.utils.other - Detected kernel version 4.18.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
02/05/2024 22:55:21 - INFO - __main__ - *** Predict ***
02/05/2024 22:55:21 - INFO - __main__ - *** test_ar_SA ***
[INFO|trainer.py:718] 2024-02-05 22:55:21,494 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 22:55:21,504 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 22:55:21,504 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 22:55:21,505 >>   Batch size = 32
[WARNING|logging.py:314] 2024-02-05 22:55:21,509 >> You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
100%|██████████| 93/93 [00:40<00:00,  2.31it/s]
***** predict_test_ar_SA metrics *****
  predict_ex_match_acc         =     0.4526
  predict_ex_match_acc_stderr  =     0.0091
  predict_intent_acc           =     0.7135
  predict_intent_acc_stderr    =     0.0083
  predict_loss                 =     0.5892
  predict_runtime              = 0:00:41.35
  predict_samples              =       2974
  predict_samples_per_second   =     71.922
  predict_slot_micro_f1        =     0.6007
  predict_slot_micro_f1_stderr =     0.0039
  predict_steps_per_second     =      2.249
02/05/2024 22:56:03 - INFO - __main__ - *** test_de_DE ***
[INFO|trainer.py:718] 2024-02-05 22:56:03,080 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 22:56:03,083 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 22:56:03,083 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 22:56:03,083 >>   Batch size = 32
100%|██████████| 93/93 [00:41<00:00,  2.24it/s]
***** predict_test_de_DE metrics *****
  predict_ex_match_acc         =     0.6019
  predict_ex_match_acc_stderr  =      0.009
  predict_intent_acc           =     0.8507
  predict_intent_acc_stderr    =     0.0065
  predict_loss                 =     0.4208
  predict_runtime              = 0:00:41.93
  predict_samples              =       2974
  predict_samples_per_second   =     70.912
  predict_slot_micro_f1        =     0.7291
  predict_slot_micro_f1_stderr =     0.0032
  predict_steps_per_second     =      2.217
02/05/2024 22:56:45 - INFO - __main__ - *** test_es_ES ***
[INFO|trainer.py:718] 2024-02-05 22:56:45,260 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 22:56:45,263 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 22:56:45,263 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 22:56:45,263 >>   Batch size = 32
100%|██████████| 93/93 [00:48<00:00,  1.92it/s]
***** predict_test_es_ES metrics *****
  predict_ex_match_acc         =     0.6231
  predict_ex_match_acc_stderr  =     0.0089
  predict_intent_acc           =     0.8591
  predict_intent_acc_stderr    =     0.0064
  predict_loss                 =     0.2681
  predict_runtime              = 0:00:48.92
  predict_samples              =       2974
  predict_samples_per_second   =     60.782
  predict_slot_micro_f1        =     0.7314
  predict_slot_micro_f1_stderr =      0.003
  predict_steps_per_second     =      1.901
02/05/2024 22:57:34 - INFO - __main__ - *** test_fr_FR ***
[INFO|trainer.py:718] 2024-02-05 22:57:34,438 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 22:57:34,440 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 22:57:34,441 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 22:57:34,441 >>   Batch size = 32
100%|██████████| 93/93 [00:49<00:00,  1.90it/s]
***** predict_test_fr_FR metrics *****
  predict_ex_match_acc         =      0.466
  predict_ex_match_acc_stderr  =     0.0091
  predict_intent_acc           =      0.847
  predict_intent_acc_stderr    =     0.0066
  predict_loss                 =     0.6841
  predict_runtime              = 0:00:49.51
  predict_samples              =       2974
  predict_samples_per_second   =     60.061
  predict_slot_micro_f1        =     0.5181
  predict_slot_micro_f1_stderr =     0.0034
  predict_steps_per_second     =      1.878
02/05/2024 22:58:24 - INFO - __main__ - *** test_hu_HU ***
[INFO|trainer.py:718] 2024-02-05 22:58:24,215 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 22:58:24,218 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 22:58:24,219 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 22:58:24,219 >>   Batch size = 32
100%|██████████| 93/93 [00:40<00:00,  2.28it/s]
***** predict_test_hu_HU metrics *****
  predict_ex_match_acc         =      0.534
  predict_ex_match_acc_stderr  =     0.0091
  predict_intent_acc           =     0.8184
  predict_intent_acc_stderr    =     0.0071
  predict_loss                 =     0.5463
  predict_runtime              = 0:00:41.23
  predict_samples              =       2974
  predict_samples_per_second   =     72.121
  predict_slot_micro_f1        =     0.6581
  predict_slot_micro_f1_stderr =     0.0036
  predict_steps_per_second     =      2.255
02/05/2024 22:59:05 - INFO - __main__ - *** test_ko_KR ***
[INFO|trainer.py:718] 2024-02-05 22:59:05,706 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 22:59:05,709 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 22:59:05,709 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 22:59:05,710 >>   Batch size = 32
100%|██████████| 93/93 [00:35<00:00,  2.63it/s]
***** predict_test_ko_KR metrics *****
  predict_ex_match_acc         =     0.5562
  predict_ex_match_acc_stderr  =     0.0091
  predict_intent_acc           =     0.8305
  predict_intent_acc_stderr    =     0.0069
  predict_loss                 =     0.8564
  predict_runtime              = 0:00:35.72
  predict_samples              =       2974
  predict_samples_per_second   =     83.241
  predict_slot_micro_f1        =      0.654
  predict_slot_micro_f1_stderr =      0.004
  predict_steps_per_second     =      2.603
02/05/2024 22:59:41 - INFO - __main__ - *** test_nl_NL ***
[INFO|trainer.py:718] 2024-02-05 22:59:41,670 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 22:59:41,672 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 22:59:41,673 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 22:59:41,673 >>   Batch size = 32
100%|██████████| 93/93 [00:44<00:00,  2.09it/s]
***** predict_test_nl_NL metrics *****
  predict_ex_match_acc         =     0.5941
  predict_ex_match_acc_stderr  =      0.009
  predict_intent_acc           =     0.8584
  predict_intent_acc_stderr    =     0.0064
  predict_loss                 =     0.4583
  predict_runtime              = 0:00:45.08
  predict_samples              =       2974
  predict_samples_per_second   =     65.959
  predict_slot_micro_f1        =     0.6931
  predict_slot_micro_f1_stderr =     0.0032
  predict_steps_per_second     =      2.063
02/05/2024 23:00:26 - INFO - __main__ - *** test_pl_PL ***
[INFO|trainer.py:718] 2024-02-05 23:00:26,995 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 23:00:26,998 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 23:00:26,998 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 23:00:26,998 >>   Batch size = 32
100%|██████████| 93/93 [00:43<00:00,  2.15it/s]
***** predict_test_pl_PL metrics *****
  predict_ex_match_acc         =     0.5901
  predict_ex_match_acc_stderr  =      0.009
  predict_intent_acc           =     0.8521
  predict_intent_acc_stderr    =     0.0065
  predict_loss                 =     0.3363
  predict_runtime              = 0:00:43.71
  predict_samples              =       2974
  predict_samples_per_second   =     68.026
  predict_slot_micro_f1        =     0.7076
  predict_slot_micro_f1_stderr =     0.0035
  predict_steps_per_second     =      2.127
02/05/2024 23:01:10 - INFO - __main__ - *** test_pt_PT ***
[INFO|trainer.py:718] 2024-02-05 23:01:10,962 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 23:01:10,964 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 23:01:10,965 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 23:01:10,965 >>   Batch size = 32
100%|██████████| 93/93 [00:47<00:00,  1.96it/s]
***** predict_test_pt_PT metrics *****
  predict_ex_match_acc         =     0.5723
  predict_ex_match_acc_stderr  =     0.0091
  predict_intent_acc           =     0.8514
  predict_intent_acc_stderr    =     0.0065
  predict_loss                 =     0.4385
  predict_runtime              = 0:00:47.86
  predict_samples              =       2974
  predict_samples_per_second   =     62.134
  predict_slot_micro_f1        =     0.6632
  predict_slot_micro_f1_stderr =     0.0033
  predict_steps_per_second     =      1.943
02/05/2024 23:01:59 - INFO - __main__ - *** test_ru_RU ***
[INFO|trainer.py:718] 2024-02-05 23:01:59,083 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 23:01:59,085 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 23:01:59,086 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 23:01:59,086 >>   Batch size = 32
100%|██████████| 93/93 [00:42<00:00,  2.19it/s]
***** predict_test_ru_RU metrics *****
  predict_ex_match_acc         =     0.6513
  predict_ex_match_acc_stderr  =     0.0087
  predict_intent_acc           =     0.8652
  predict_intent_acc_stderr    =     0.0063
  predict_loss                 =     0.2806
  predict_runtime              = 0:00:42.98
  predict_samples              =       2974
  predict_samples_per_second   =      69.18
  predict_slot_micro_f1        =     0.7584
  predict_slot_micro_f1_stderr =     0.0033
  predict_steps_per_second     =      2.163
02/05/2024 23:02:42 - INFO - __main__ - *** test_tr_TR ***
[INFO|trainer.py:718] 2024-02-05 23:02:42,317 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 23:02:42,320 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 23:02:42,320 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 23:02:42,320 >>   Batch size = 32
100%|██████████| 93/93 [00:42<00:00,  2.17it/s]
***** predict_test_tr_TR metrics *****
  predict_ex_match_acc         =     0.5521
  predict_ex_match_acc_stderr  =     0.0091
  predict_intent_acc           =     0.8285
  predict_intent_acc_stderr    =     0.0069
  predict_loss                 =     0.5882
  predict_runtime              = 0:00:43.26
  predict_samples              =       2974
  predict_samples_per_second   =     68.745
  predict_slot_micro_f1        =     0.6555
  predict_slot_micro_f1_stderr =     0.0037
  predict_steps_per_second     =       2.15
02/05/2024 23:03:25 - INFO - __main__ - *** test_vi_VN ***
[INFO|trainer.py:718] 2024-02-05 23:03:25,808 >> The following columns in the test set don't have a corresponding argument in `MT5ForConditionalGeneration.forward` and have been ignored: id, intent_str, annot_utt. If id, intent_str, annot_utt are not expected by `MT5ForConditionalGeneration.forward`,  you can safely ignore this message.
[INFO|trainer.py:3199] 2024-02-05 23:03:25,811 >> ***** Running Prediction *****
[INFO|trainer.py:3201] 2024-02-05 23:03:25,811 >>   Num examples = 2974
[INFO|trainer.py:3204] 2024-02-05 23:03:25,811 >>   Batch size = 32
100%|██████████| 93/93 [01:01<00:00,  1.51it/s]
***** predict_test_vi_VN metrics *****
  predict_ex_match_acc         =     0.5215
  predict_ex_match_acc_stderr  =     0.0092
  predict_intent_acc           =     0.7774
  predict_intent_acc_stderr    =     0.0076
  predict_loss                 =      0.325
  predict_runtime              = 0:01:02.07
  predict_samples              =       2974
  predict_samples_per_second   =     47.909
  predict_slot_micro_f1        =     0.6561
  predict_slot_micro_f1_stderr =     0.0029
  predict_steps_per_second     =      1.498