---
license: cc-by-nc-4.0
base_model: nguyenvulebinh/wav2vec2-base-vi
tags:
- automatic-speech-recognition
- quocanh34/asr_spoken_norm_train_data
- generated_from_trainer
metrics:
- wer
model-index:
- name: wav2vec2-baseline
  results: []
---

# wav2vec2-baseline

This model is a fine-tuned version of [nguyenvulebinh/wav2vec2-base-vi](https://huggingface.co/nguyenvulebinh/wav2vec2-base-vi) on the QUOCANH34/ASR_SPOKEN_NORM_TRAIN_DATA - NA dataset.
It achieves the following results on the evaluation set:
- Loss: nan
- Wer: 1.0

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 30.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| No log | 0.43 | 100 | inf | 1.2580 |
| No log | 0.85 | 200 | inf | 1.0 |
| No log | 1.28 | 300 | inf | 1.0 |
| No log | 1.71 | 400 | inf | 1.0 |
| 12.36 | 2.14 | 500 | inf | 1.0 |
| 12.36 | 2.56 | 600 | inf | 1.0 |
| 12.36 | 2.99 | 700 | inf | 1.0 |
| 12.36 | 3.42 | 800 | inf | 1.0 |
| 12.36 | 3.85 | 900 | inf | 1.0 |
| 8.4674 | 4.27 | 1000 | inf | 1.0 |
| 8.4674 | 4.7 | 1100 | inf | 1.0 |
| 8.4674 | 5.13 | 1200 | inf | 1.0 |
| 8.4674 | 5.56 | 1300 | inf | 1.0 |
| 8.4674 | 5.98 | 1400 | inf | 1.0 |
| 6.9866 | 6.41 | 1500 | inf | 1.0 |
| 6.9866 | 6.84 | 1600 | inf | 1.0 |
| 6.9866 | 7.26 | 1700 | inf | 1.0 |
| 6.9866 | 7.69 | 1800 | inf | 1.0 |
| 6.9866 | 8.12 | 1900 | inf | 1.0 |
| 6.8089 | 8.55 | 2000 | inf | 1.0 |
| 6.8089 | 8.97 | 2100 | inf | 1.0 |
| 6.8089 | 9.4 | 2200 | inf | 1.0 |
| 6.8089 | 9.83 | 2300 | inf | 1.0 |
| 6.8089 | 10.26 | 2400 | inf | 1.0 |
| 6.7847 | 10.68 | 2500 | inf | 1.0 |
| 6.7847 | 11.11 | 2600 | inf | 1.0 |
| 6.7847 | 11.54 | 2700 | inf | 1.0 |
| 6.7847 | 11.97 | 2800 | inf | 1.0 |
| 6.7847 | 12.39 | 2900 | inf | 1.0 |
| 6.7941 | 12.82 | 3000 | inf | 1.0 |
| 6.7941 | 13.25 | 3100 | inf | 1.0 |
| 6.7941 | 13.68 | 3200 | inf | 1.0 |
| 6.7941 | 14.1 | 3300 | inf | 1.0 |
| 6.7941 | 14.53 | 3400 | inf | 1.0 |
| 6.7956 | 14.96 | 3500 | inf | 1.0 |
| 6.7956 | 15.38 | 3600 | inf | 1.0 |
| 6.7956 | 15.81 | 3700 | inf | 1.0 |
| 6.7956 | 16.24 | 3800 | inf | 1.0 |
| 6.7956 | 16.67 | 3900 | inf | 1.0 |
| 6.8102 | 17.09 | 4000 | inf | 1.0 |
| 6.8102 | 17.52 | 4100 | inf | 1.0 |
| 6.8102 | 17.95 | 4200 | inf | 1.0 |
| 6.8102 | 18.38 | 4300 | inf | 1.0 |
| 6.8102 | 18.8 | 4400 | inf | 1.0 |
| 6.7761 | 19.23 | 4500 | inf | 1.0 |
| 6.7761 | 19.66 | 4600 | inf | 1.0 |
| 6.7761 | 20.09 | 4700 | inf | 1.0 |
| 6.7761 | 20.51 | 4800 | inf | 1.0 |
| 6.7761 | 20.94 | 4900 | inf | 1.0 |
| 6.8063 | 21.37 | 5000 | inf | 1.0 |
| 6.8063 | 21.79 | 5100 | inf | 1.0 |
| 6.8063 | 22.22 | 5200 | inf | 1.0 |
| 6.8063 | 22.65 | 5300 | inf | 1.0 |
| 6.8063 | 23.08 | 5400 | inf | 1.0 |
| 6.7934 | 23.5 | 5500 | inf | 1.0 |
| 6.7934 | 23.93 | 5600 | inf | 1.0 |
| 6.7934 | 24.36 | 5700 | inf | 1.0 |
| 6.7934 | 24.79 | 5800 | inf | 1.0 |
| 6.7934 | 25.21 | 5900 | inf | 1.0 |
| 6.7819 | 25.64 | 6000 | inf | 1.0 |
| 6.7819 | 26.07 | 6100 | inf | 1.0 |
| 6.7819 | 26.5 | 6200 | inf | 1.0 |
| 6.7819 | 26.92 | 6300 | inf | 1.0 |
| 6.7819 | 27.35 | 6400 | inf | 1.0 |
| 6.8278 | 27.78 | 6500 | inf | 1.0 |
| 6.8278 | 28.21 | 6600 | inf | 1.0 |
| 6.8278 | 28.63 | 6700 | inf | 1.0 |
| 6.8278 | 29.06 | 6800 | nan | 1.0 |
| 6.8278 | 29.49 | 6900 | nan | 1.0 |
| 6.7427 | 29.91 | 7000 | nan | 1.0 |

### Framework versions

- Transformers 4.33.0.dev0
- Pytorch 2.0.0
- Datasets 2.14.4
- Tokenizers 0.13.3
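
The linear schedule listed under the training hyperparameters (learning_rate 8e-06, 500 warmup steps) can be sketched in plain Python to show roughly what learning rate applied at each logged step. This is a minimal illustration, not the training code: the function name `linear_warmup_lr` is hypothetical, and `total_steps=7020` is an assumption estimated from the results table (about 234 steps per epoch over 30 epochs), since the card does not state the total step count.

```python
def linear_warmup_lr(step, base_lr=8e-06, warmup_steps=500, total_steps=7020):
    """Learning rate at `step` for linear warmup followed by linear decay.

    base_lr and warmup_steps come from the hyperparameter list in this card;
    total_steps is an ASSUMPTION estimated from the results table.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Print the learning rate at a few steps across the run.
for step in (0, 250, 500, 3760, 7020):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the learning rate peaks at step 500, which is also the first step with a logged training loss in the table, and then decays toward zero by the end of training.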