A wav2vec2.0-xlsr-53 model fine-tuned on a large Korean speech corpus of children aged 3 to 5 years.



---
language: 
  - ko
thumbnail: "url to a thumbnail used in social sharing"
tags:
- automatic-speech-recognition
- speech
datasets:
- AI-HUB
metrics:
- cer
model-index:
- name: xlsr-53-korean_kids_3_to_5_syl
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      type: AI-HUB         
      name: AI-HUB
      split: test
    metrics:
      - type: cer         
        value: 3.48       
        name: Syllable Error Rate
---


Dataset: speech from children aged 3 to 5 years
- Number of files (train:valid:test): 256,548:12,121:12,000
- Transcripts were tokenized into morphemes with MeCab

Evaluation results
- Dataset: test set from the corpus
- Performance: syllable error rate on the test set = 3.48%
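
The card does not say how the syllable error rate was computed. As a rough illustration only, the sketch below assumes the usual edit-distance definition: substitutions, deletions, and insertions summed over the test set and divided by the number of reference syllables. It also assumes that each non-whitespace character (one Hangul block) counts as one syllable and that spaces are ignored. The function names `edit_distance` and `syllable_error_rate` are hypothetical and not part of this model's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def syllable_error_rate(references, hypotheses):
    """Corpus-level syllable error rate in percent.

    Assumption: every non-whitespace character is one syllable,
    and whitespace is dropped before scoring.
    """
    total_errors = 0
    total_syllables = 0
    for ref, hyp in zip(references, hypotheses):
        ref_syl = [c for c in ref if not c.isspace()]
        hyp_syl = [c for c in hyp if not c.isspace()]
        total_errors += edit_distance(ref_syl, hyp_syl)
        total_syllables += len(ref_syl)
    if total_syllables == 0:
        raise ValueError("references contain no syllables")
    return 100.0 * total_errors / total_syllables


if __name__ == "__main__":
    # Toy example: one wrong syllable out of five reference syllables.
    print(syllable_error_rate(["안녕하세요"], ["안녕하세오"]))
```

Summing errors over the whole corpus before dividing weights long utterances more heavily than averaging per-utterance rates would; which of the two the reported 3.48% uses is not stated in the card.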