metadata
base_model: facebook/wav2vec2-large-xlsr-53
datasets:
  - fleurs
library_name: transformers
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: wav2vec2-large-xlsr-53-Hindi-Version1
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: fleurs
          type: fleurs
          config: hi_in
          split: None
          args: hi_in
        metrics:
          - type: wer
            value: 0.5457385531582544
            name: Wer

wav2vec2-large-xlsr-53-Hindi-Version1

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the Hindi (hi_in) configuration of the FLEURS dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7287
  • Wer: 0.5457
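
For readers unfamiliar with the metric, word error rate is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition, not the evaluation code used for this model; the function name `word_error_rate` and the sample strings are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over 4 reference words.
example = word_error_rate("a b c d", "a x c")
```

Under this definition a score of 0.5457 means roughly 55 word-level errors per 100 reference words, and an empty hypothesis scores exactly 1.0.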

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 70
  • mixed_precision_training: Native AMP
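
Two of the values above are derived rather than set directly: the total train batch size is the per-device batch size times the gradient accumulation steps, and the learning rate at any step follows from the linear scheduler with warmup. The sketch below reproduces both in plain Python. It assumes a single device and a schedule that ramps linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the final step; `TOTAL_STEPS = 5180` is an assumption (70 epochs at 74 optimizer steps per epoch, which is the ratio the logged steps and epochs suggest), not a value stated in the card.

```python
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 500
TOTAL_STEPS = 5180  # assumption: 70 epochs * 74 optimizer steps per epoch

# Effective batch size per optimizer step (single device assumed).
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_lr(step: int,
              peak_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from the peak down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 250, 500, 2500, 5000):
    print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

With these assumptions the peak learning rate of 3e-05 is reached at step 500, which is also the first logged checkpoint.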

Training results

Training Loss   Epoch     Step   Validation Loss   Wer
3.6017          6.7568    500    3.5280            1.0
3.3879          13.5135   1000   3.3755            1.0
3.3566          20.2703   1500   3.3544            1.0
3.3133          27.0270   2000   3.2753            1.0
2.216           33.7838   2500   1.8757            0.9159
1.2972          40.5405   3000   1.0386            0.6969
1.0939          47.2973   3500   0.8590            0.6190
1.0188          54.0541   4000   0.7791            0.5797
0.9468          60.8108   4500   0.7461            0.5575
0.9806          67.5676   5000   0.7287            0.5457
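
The Epoch and Step columns can be cross-checked against each other: dividing each logged step by its fractional epoch gives the number of optimizer steps per epoch. The sketch below does this arithmetic on the rows above; the projected total for the full 70 epochs is an inference from the log, not a figure reported in the card.

```python
# (epoch, step) pairs copied from the training results table.
logged = [
    (6.7568, 500), (13.5135, 1000), (20.2703, 1500), (27.0270, 2000),
    (33.7838, 2500), (40.5405, 3000), (47.2973, 3500), (54.0541, 4000),
    (60.8108, 4500), (67.5676, 5000),
]
NUM_EPOCHS = 70

# Each row should imply the same whole number of optimizer steps per epoch.
steps_per_epoch = {round(step / epoch) for epoch, step in logged}

# Projected number of optimizer steps for the whole run (inferred, not logged).
projected_total_steps = NUM_EPOCHS * next(iter(steps_per_epoch))

print(steps_per_epoch, projected_total_steps)
```

Every row agrees on the same ratio, so the last evaluation at step 5000 falls shortly before the end of the projected run.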

Framework versions

  • Transformers 4.45.0.dev0
  • PyTorch 2.4.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1