---
license: mit
datasets:
- mozilla-foundation/common_voice_13_0
language:
- ca
- ta
- th
---

## About

Multilingual DistilWhisper improves ASR performance in target languages by adding lightweight CLSR modules on top of whisper-small.
These modules are trained on a mix of cross-entropy (ASR) and knowledge distillation losses, where whisper-large-v2 is used as the teacher.
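As a rough illustration of that objective, the sketch below combines a token-level cross-entropy term with a KL-based distillation term against teacher logits. The weighting `alpha`, temperature `T`, and function names are assumptions for illustration only, not the actual DistilWhisper training code.

```python
# Illustrative sketch of a combined ASR + distillation objective.
# `alpha`, `T`, and the token-level formulation are assumptions, not the
# exact DistilWhisper recipe.
import torch
import torch.nn.functional as F

def distil_loss(student_logits, teacher_logits, labels, alpha=0.5, T=1.0):
    # Cross-entropy (ASR) term against the ground-truth transcription tokens.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    # Knowledge-distillation term: KL divergence between the student's and
    # the (frozen) whisper-large-v2 teacher's output distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1).view(-1, student_logits.size(-1)),
        F.softmax(teacher_logits / T, dim=-1).view(-1, teacher_logits.size(-1)),
        reduction="batchmean",
    ) * (T ** 2)
    return alpha * ce + (1.0 - alpha) * kd
```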

## Inference

A loader will be made available soon at https://github.com/naver
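In the meantime, the plain whisper-small backbone can be run with Hugging Face Transformers as a starting point. Note that this sketch does not include the CLSR expert modules, so it is not the DistilWhisper model itself.

```python
# Starting-point sketch only: runs the plain openai/whisper-small backbone.
# The CLSR expert modules require the upcoming loader and are NOT included.
import numpy as np
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Placeholder 1-second silent clip; substitute a real 16 kHz mono waveform
# (e.g. an utterance from Common Voice 13.0).
audio_array = np.zeros(16000, dtype=np.float32)

inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs.input_features)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
```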

## Citation (submitted to ICASSP 2024)

```
@article{ferraz2023distilwhisper,
  title={DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts},
  author={Ferraz, Thomas Palmeira and Boito, Marcely Zanon and Brun, Caroline and Nikoulina, Vassilina},
  journal={arXiv preprint arXiv:2311.01070},
  year={2023}
}
```