---
language:
- ko
thumbnail: "url to a thumbnail used in social sharing"
tags:
- automatic-speech-recognition
- speech
datasets:
- AI-HUB
metrics:
- cer
model-index:
- name: xlsr-53-korean_kids_3_to_5_syl
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      type: AI-HUB
      name: AI-HUB
      split: test
    metrics:
    - type: cer
      value: 3.48
      name: Syllable Error Rate
---

A wav2vec2.0-xlsr-53 model fine-tuned on a large corpus of speech from Korean children aged 3 to 5 years.

Dataset: speech from children aged 3 to 5 years
- Number of files (train : valid : test): 256,548 : 12,121 : 12,000
- Transcripts were tokenized into morphemes with MeCab

Evaluation results:
- Dataset: test set from the corpus
- Performance: syllable error rate on the test set = 3.48%
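
For readers who want to score their own transcripts the same way, the following is a minimal sketch of one common way to compute a syllable error rate: edit distance over Hangul syllable blocks divided by the number of reference syllables. It is not the evaluation script used for the figure above. The function names are hypothetical, and both the NFC normalization and the removal of whitespace before scoring are assumptions, since the card does not say how spaces were handled.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (two-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def to_syllables(text):
    """Split text into syllable blocks.

    NFC composes Hangul jamo into one code point per syllable block.
    Dropping whitespace is an assumption, not something the card states.
    """
    normalized = unicodedata.normalize("NFC", text)
    return [ch for ch in normalized if not ch.isspace()]


def syllable_error_rate(references, hypotheses):
    """Corpus-level syllable error rate in percent."""
    total_edits = 0
    total_syllables = 0
    for ref, hyp in zip(references, hypotheses):
        ref_syl = to_syllables(ref)
        hyp_syl = to_syllables(hyp)
        total_edits += edit_distance(ref_syl, hyp_syl)
        total_syllables += len(ref_syl)
    return 100.0 * total_edits / total_syllables


if __name__ == "__main__":
    # Made-up example pair, not taken from the corpus.
    refs = ["엄마 밥 주세요"]
    hyps = ["엄마 밥 주세여"]
    print(f"{syllable_error_rate(refs, hyps):.2f}%")
```

In the made-up pair, one of the six reference syllables is substituted, so the sketch reports one error out of six. Summing edits and syllables over the whole test set before dividing weights long utterances more than short ones; averaging per-utterance rates would give a different number.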