
# Circassian (Kabardian) ASR Model

This is a fine-tuned model for Automatic Speech Recognition (ASR) in Kabardian (`kbd`), based on facebook/w2v-bert-2.0.

The model was trained on a combination of the `anzorq/kbd_speech` (filtered to `country=russia`) and `anzorq/sixuxar_yijiri_mak7` datasets.
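
For reference, a minimal transcription sketch. Assumptions not stated on this card: the checkpoint loads with a CTC head via `Wav2Vec2BertForCTC`, its processor is bundled with the repo, and `sample.wav` stands in for your own audio file.

```python
# Minimal inference sketch. The checkpoint name anzorq/w2v-bert-2.0-kbd is the
# one used on this card; the CTC head and bundled processor are assumptions.
import torch
import torchaudio
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "anzorq/w2v-bert-2.0-kbd"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)
model.eval()

# Load audio and resample to the 16 kHz input rate expected by w2v-bert-2.0.
waveform, sr = torchaudio.load("sample.wav")  # hypothetical input file
speech = torchaudio.functional.resample(waveform, sr, 16_000).mean(dim=0)

inputs = processor(speech.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: per-frame argmax, then collapse repeats and blanks.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```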

## Model Details

- **Base Model:** facebook/w2v-bert-2.0
- **Language:** Kabardian (`kbd`)
- **Task:** Automatic Speech Recognition (ASR)
- **Datasets:** anzorq/kbd_speech, anzorq/sixuxar_yijiri_mak7 (see the loading sketch below)
- **Training Steps:** 5000
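
A sketch of how the training mix described above could be assembled with the `datasets` library. The `country` column name and the `train` splits are inferred from this card, not verified against the datasets' actual schemas.

```python
from datasets import concatenate_datasets, load_dataset

# anzorq/kbd_speech, keeping only rows recorded in Russia, per this card.
# The "country" column name is an assumption based on the filter described above.
kbd_speech = load_dataset("anzorq/kbd_speech", split="train")
kbd_speech = kbd_speech.filter(lambda row: row["country"] == "russia")

sixuxar = load_dataset("anzorq/sixuxar_yijiri_mak7", split="train")

# concatenate_datasets requires matching features; if the two schemas differ,
# columns may need to be aligned or dropped first.
train_ds = concatenate_datasets([kbd_speech, sixuxar])
```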

## Training

The model was fine-tuned using the following training arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='output',
    group_by_length=True,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    evaluation_strategy="steps",
    num_train_epochs=10,
    gradient_checkpointing=True,
    fp16=True,
    save_steps=1000,
    eval_steps=500,
    logging_steps=300,
    learning_rate=5e-5,
    warmup_steps=500,
    save_total_limit=2,
    push_to_hub=True,
    report_to="wandb",
)
```
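
These arguments would plug into a standard `Trainer` roughly as follows. This is a sketch, not the card author's script: the model, the CTC padding data collator, the datasets, and `compute_metrics` are all assumed to be defined elsewhere.

```python
from transformers import Trainer

# Sketch of wiring the arguments above into a Trainer. model, data_collator,
# train_ds, eval_ds, and compute_metrics (e.g. WER) do not appear on this
# card and are assumed to be defined elsewhere in the training script.
trainer = Trainer(
    model=model,
    args=training_args,           # the TrainingArguments shown above
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
trainer.train()
```

Note that with `per_device_train_batch_size=8` and `gradient_accumulation_steps=2`, the effective batch size is 16 per device.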

## Performance

The model's performance during training:

| Step | Training Loss | Validation Loss | WER |
|-----:|--------------:|----------------:|---------:|
| 500  | 2.859600 | inf | 0.870362 |
| 1000 | 0.355500 | inf | 0.703617 |
| 1500 | 0.247100 | inf | 0.549942 |
| 2000 | 0.196700 | inf | 0.471762 |
| 2500 | 0.181500 | inf | 0.361494 |
| 3000 | 0.152200 | inf | 0.314119 |
| 3500 | 0.135700 | inf | 0.275146 |
| 4000 | 0.113400 | inf | 0.252625 |
| 4500 | 0.102900 | inf | 0.277013 |
| 5000 | 0.078500 | inf | 0.250175 |
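
The validation loss is reported as `inf` at every step, which is commonly seen when the CTC loss overflows under fp16 training; the WER column is therefore the metric to track here, improving from 0.87 to 0.25 over 5000 steps. For reference, a sketch of computing WER with the `evaluate` library (the prediction and reference strings are placeholders):

```python
import evaluate

wer_metric = evaluate.load("wer")

# Hypothetical decoded outputs and reference transcripts.
predictions = ["decoded model output"]
references = ["ground-truth transcript"]

# WER = (substitutions + insertions + deletions) / reference word count.
print(wer_metric.compute(predictions=predictions, references=references))
```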