repo link; papers
README.md
CHANGED
@@ -37,6 +37,8 @@ tags:
 pipeline_tag: text-to-speech
 ---

+GitHub project: https://github.com/DanRuta/xVA-Synth
+
 The base model for training other xVASynth's "xVAPitch" type models (v3). Model itself used by the xVATrainer TTS model training app. All created by Dan Ruta.

 When used in xVASynth editor, it is an American Adult Male voice. Default pacing is too fast and has to be adjusted.
@@ -46,4 +48,10 @@ xVAPitch_5820651 model sample: <audio controls>
 Your browser does not support the audio element.
 </audio>

+xVAPitch model referenced Papers:
+- Multi-head attention with Relative Positional embedding - https://arxiv.org/pdf/1809.04281.pdf
+- Transformer with Relative Positional Encoding - https://arxiv.org/abs/1803.02155
+- SDP - https://arxiv.org/pdf/2106.06103.pdf
+- Spline Flow - https://arxiv.org/abs/1906.04032
+
 Used datasets: Unknown/Non-permissible data