Pendrokar committed on
Commit
20c3e6b
1 Parent(s): 460e46a

repo link; papers

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -37,6 +37,8 @@ tags:
 pipeline_tag: text-to-speech
 ---
 
+GitHub project: https://github.com/DanRuta/xVA-Synth
+
 The base model for training other xVASynth's "xVAPitch" type models (v3). Model itself used by the xVATrainer TTS model training app. All created by Dan Ruta.
 
 When used in xVASynth editor, it is an American Adult Male voice. Default pacing is too fast and has to be adjusted.
@@ -46,4 +48,10 @@ xVAPitch_5820651 model sample: <audio controls>
 Your browser does not support the audio element.
 </audio>
 
+xVAPitch model referenced Papers:
+- Multi-head attention with Relative Positional embedding - https://arxiv.org/pdf/1809.04281.pdf
+- Transformer with Relative Positional Encoding - https://arxiv.org/abs/1803.02155
+- SDP - https://arxiv.org/pdf/2106.06103.pdf
+- Spline Flow - https://arxiv.org/abs/1906.04032
+
 Used datasets: Unknown/Non-permissible data