Commit 40abaa8 by Changhan (1 parent: e47c48c)

Create README.md

Files changed (1):
  1. README.md (+48, -0)

README.md ADDED:
---
library_name: fairseq
task: text-to-speech
tags:
- fairseq
- audio
- text-to-speech
language: en
datasets:
- ljspeech
---
## Example to download fastspeech2 from fairseq

The following should work with the most recent version of fairseq in a Google Colab notebook:
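Note that a fresh Colab runtime does not come with fairseq or the `g2p_en` phonemizer used below; assuming a standard environment, installing them first with `pip install fairseq g2p_en` should be enough, since `torch` and `IPython` are already preinstalled on Colab.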
```python
from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
import IPython.display as ipd
import torch

# Download the checkpoint from the Hugging Face Hub and load the model, its config and the task.
model_ensemble, cfg, task = load_model_ensemble_and_task_from_hf_hub(
    "facebook/tts_transformer-en-ljspeech",
    arg_overrides={"vocoder": "griffin_lim", "fp16": False},
)


def tokenize(text):
    # Convert raw text into the phoneme sequence the model expects;
    # commas and semicolons become the "sp" (short pause) token.
    import g2p_en

    tokenized = g2p_en.G2p()(text)
    tokenized = [{",": "sp", ";": "sp"}.get(p, p) for p in tokenized]
    return " ".join(p for p in tokenized if p.isalnum())


text = "Hello, this is a test run."
tokenized = tokenize(text)

# Build a minimal batch of size 1 for inference.
sample = {
    "net_input": {
        "src_tokens": task.src_dict.encode_line(tokenized).view(1, -1),
        "src_lengths": torch.Tensor([len(tokenized.split())]).long(),
        "prev_output_tokens": None,
    },
    "target_lengths": None,
    "speaker": None,
}

# Generate a spectrogram and turn it into a waveform with the Griffin-Lim vocoder.
generator = task.build_generator(model_ensemble, cfg)
generation = generator.generate(model_ensemble[0], sample)
waveform = generation[0]["waveform"]

# Play the audio at the task's sampling rate.
ipd.Audio(waveform, rate=task.sr)
```
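
For reference, recent fairseq releases also ship a higher-level `TTSHubInterface` helper that performs the phonemization and batch construction from the example above internally. The following sketch assumes such a release and reuses the `model_ensemble`, `cfg` and `task` objects loaded earlier:

```python
from fairseq.models.text_to_speech.hub_interface import TTSHubInterface

# Copy data-config settings (such as the vocoder choice) into the generation config.
TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
generator = task.build_generator(model_ensemble, cfg)

# Text-to-phoneme conversion and batching happen inside the helper.
sample = TTSHubInterface.get_model_input(task, "Hello, this is a test run.")
wav, rate = TTSHubInterface.get_prediction(task, model_ensemble[0], generator, sample)

ipd.Audio(wav, rate=rate)
```

In either variant the waveform can also be written to disk, e.g. with the `soundfile` package (`import soundfile as sf; sf.write("out.wav", waveform.numpy(), task.sr)`), assuming `soundfile` is installed.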