---
license: cc-by-nc-4.0
base_model: nguyenvulebinh/wav2vec2-base-vietnamese-250h
tags:
- generated_from_trainer
datasets:
- common_voice_11_0
metrics:
- wer
model-index:
- name: model_weight_with_token_110
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: common_voice_11_0
      type: common_voice_11_0
      config: vi
      split: None
      args: vi
    metrics:
    - name: Wer
      type: wer
      value: 0.17328485312410297
---

# model_weight_with_token_110

This model is a fine-tuned version of [nguyenvulebinh/wav2vec2-base-vietnamese-250h](https://huggingface.co/nguyenvulebinh/wav2vec2-base-vietnamese-250h) on the Vietnamese (`vi`) configuration of the common_voice_11_0 dataset.
It achieves the following results on the evaluation set at the final logged step (14000):
- Loss: 0.0688
- Wer: 0.1733
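
For readers unfamiliar with the metric, word error rate (WER) is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The following is a minimal pure-Python sketch of that definition. The function names are illustrative, and it is not necessarily the exact implementation or text normalization that produced the figure above.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        hyp_words = hyp.split()
        total_edits += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    return total_edits / total_words


# One deleted word out of four reference words.
print(word_error_rate(["xin chào các bạn"], ["xin chào bạn"]))
```

Under this definition, a WER of 0.1733 corresponds to roughly 17 word errors per 100 reference words.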

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 40
- mixed_precision_training: Native AMP
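
To make the interaction of `learning_rate`, `lr_scheduler_type: linear`, and `lr_scheduler_warmup_steps` concrete, here is a small sketch of a linear schedule with warmup: the rate ramps from 0 to the peak over the warmup steps, then decays linearly to 0. The function name is hypothetical, and `total_steps=14360` is an assumption, estimated from the results table (about 359 steps per epoch over 40 epochs) rather than stated anywhere in this card.

```python
def linear_lr(step, base_lr=1e-4, warmup_steps=1000, total_steps=14360):
    """Learning rate at a given optimizer step for linear warmup then linear decay.

    total_steps is an assumed value (359 steps/epoch * 40 epochs),
    inferred from the logged Epoch/Step columns, not a documented setting.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 500, 1000, 7680, 14360):
    print(step, linear_lr(step))
```

With these settings the peak rate of 0.0001 is reached at step 1000, which falls between the second and third epochs.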

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|
| 0.5366        | 1.3928  | 500   | 0.1234          | 0.2107 |
| 0.4976        | 2.7855  | 1000  | 0.1343          | 0.2133 |
| 0.4734        | 4.1783  | 1500  | 0.1109          | 0.2037 |
| 0.4449        | 5.5710  | 2000  | 0.1111          | 0.2061 |
| 0.4194        | 6.9638  | 2500  | 0.1096          | 0.2024 |
| 0.3941        | 8.3565  | 3000  | 0.1231          | 0.1969 |
| 0.3767        | 9.7493  | 3500  | 0.1059          | 0.2002 |
| 0.3853        | 11.1421 | 4000  | 0.0998          | 0.1930 |
| 0.3584        | 12.5348 | 4500  | 0.0892          | 0.1905 |
| 0.3291        | 13.9276 | 5000  | 0.0926          | 0.1899 |
| 0.3279        | 15.3203 | 5500  | 0.0879          | 0.1878 |
| 0.3014        | 16.7131 | 6000  | 0.0831          | 0.1851 |
| 0.2886        | 18.1058 | 6500  | 0.0814          | 0.1857 |
| 0.2949        | 19.4986 | 7000  | 0.0880          | 0.1854 |
| 0.2661        | 20.8914 | 7500  | 0.0782          | 0.1829 |
| 0.2676        | 22.2841 | 8000  | 0.0789          | 0.1806 |
| 0.2663        | 23.6769 | 8500  | 0.0787          | 0.1805 |
| 0.2461        | 25.0696 | 9000  | 0.0788          | 0.1793 |
| 0.2484        | 26.4624 | 9500  | 0.0755          | 0.1804 |
| 0.2452        | 27.8552 | 10000 | 0.0715          | 0.1773 |
| 0.2261        | 29.2479 | 10500 | 0.0705          | 0.1764 |
| 0.2311        | 30.6407 | 11000 | 0.0757          | 0.1770 |
| 0.2195        | 32.0334 | 11500 | 0.0714          | 0.1763 |
| 0.2208        | 33.4262 | 12000 | 0.0697          | 0.1752 |
| 0.2029        | 34.8189 | 12500 | 0.0673          | 0.1744 |
| 0.2228        | 36.2117 | 13000 | 0.0691          | 0.1739 |
| 0.2056        | 37.6045 | 13500 | 0.0678          | 0.1738 |
| 0.2017        | 38.9972 | 14000 | 0.0688          | 0.1733 |


### Framework versions

- Transformers 4.40.2
- Pytorch 2.2.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1