pradnya-hf-dev committed on
Commit
b9ae525
1 Parent(s): 7cb4cc6

Create README.md

Files changed (1)
  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
1
+ ---
2
+ language: "en"
3
+ inference: false
4
+ tags:
5
+ - Vocoder
6
+ - HiFIGAN
7
+ - text-to-speech
8
+ - TTS
9
+ - speech-synthesis
10
+ - speechbrain
11
+ license: "apache-2.0"
12
+ datasets:
13
+ - LibriTTS
14
+ ---
15
+
16
+ # Vocoder with HiFIGAN trained on LibriTTS
17
+
18
+ This repository provides all the necessary tools for using a [HiFIGAN](https://arxiv.org/abs/2010.05646) vocoder trained on [LibriTTS](https://www.openslr.org/60/). The vocoder operates at a sample rate of 22050 Hz.
19
+
20
+ The pre-trained model takes a spectrogram as input and produces a waveform as output. Typically, a vocoder is used after a TTS model that converts input text into a spectrogram.
21
+
22
+
23
+ ## Install SpeechBrain
24
+
25
+ ```bash
26
+ pip install speechbrain
27
+ ```
28
+
29
+
30
+ We encourage you to read our tutorials and learn more about
31
+ [SpeechBrain](https://speechbrain.github.io).
32
+
33
+ ### Using the Vocoder
34
+
35
+ ```python
36
+ import torch
+ import torchaudio
37
+ from speechbrain.pretrained import HIFIGAN
38
+ hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-libritts-22050Hz", savedir="tmpdir")
39
+ mel_specs = torch.rand(2, 80, 298)  # random stand-in for a batch of 2 mel spectrograms
40
+
41
+ # Running Vocoder (spectrogram-to-waveform)
42
+ waveforms = hifi_gan.decode_batch(mel_specs)
43
+
44
+ # Save the waveform
45
+ torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 22050)
46
+ ```
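As a rough check on the shapes in the example above, the sketch below estimates how much audio 298 mel frames correspond to at 22050 Hz. The hop length of 256 samples per frame is an assumption: it is not stated in this README, so check `hop_length` in the recipe's hparams before relying on it. The helper names are illustrative only.

```python
SAMPLE_RATE = 22050  # Hz, as stated in this README
HOP_LENGTH = 256     # samples per mel frame; ASSUMPTION, not given in this README


def frames_to_samples(n_frames: int, hop_length: int = HOP_LENGTH) -> int:
    """Approximate number of waveform samples generated from n_frames mel frames."""
    return n_frames * hop_length


def frames_to_seconds(
    n_frames: int,
    hop_length: int = HOP_LENGTH,
    sample_rate: int = SAMPLE_RATE,
) -> float:
    """Approximate audio duration in seconds for n_frames mel frames."""
    return frames_to_samples(n_frames, hop_length) / sample_rate


n_frames = 298  # time dimension of mel_specs in the example above
print(f"{frames_to_samples(n_frames)} samples, about {frames_to_seconds(n_frames):.2f} s")
```

Under that assumption, each item in the batch would decode to roughly three and a half seconds of audio.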
47
+
48
+ ### Inference on GPU
49
+ To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
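A minimal sketch of the GPU option described above, falling back to the CPU when CUDA is unavailable. The helpers `cuda_is_available`, `select_run_opts`, and `load_vocoder` are illustrative names and not part of SpeechBrain; only `from_hparams` and `run_opts` come from this README.

```python
import importlib.util


def cuda_is_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def select_run_opts(use_cuda: bool) -> dict:
    """Build the run_opts dictionary expected by from_hparams."""
    return {"device": "cuda" if use_cuda else "cpu"}


def load_vocoder(run_opts: dict):
    """Load the pre-trained vocoder on the device named in run_opts."""
    # Imported lazily so that the helpers above work without SpeechBrain installed.
    from speechbrain.pretrained import HIFIGAN

    return HIFIGAN.from_hparams(
        source="speechbrain/tts-hifigan-libritts-22050Hz",
        savedir="tmpdir",
        run_opts=run_opts,
    )


run_opts = select_run_opts(cuda_is_available())
print(run_opts)
```

Calling `hifi_gan = load_vocoder(run_opts)` should then fetch the checkpoint and place the model on the selected device.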
50
+
51
+ ### Training
52
+ The model was trained with SpeechBrain.
53
+ To train it from scratch, follow these steps:
54
+ 1. Clone SpeechBrain:
55
+ ```bash
56
+ git clone https://github.com/speechbrain/speechbrain/
57
+ ```
58
+ 2. Install it:
59
+ ```bash
60
+ cd speechbrain
61
+ pip install -r requirements.txt
62
+ pip install -e .
63
+ ```
64
+ 3. Run Training:
65
+ ```bash
66
+ cd recipes/LibriTTS/vocoder/hifigan/
67
+ python train.py hparams/train.yaml --data_folder=/path/to/LibriTTS_data_destination
68
+ ```
69
+
70
+ To change the sample rate used for training, edit the value of `sample_rate` in the `hparams/train.yaml` file of the recipe (`recipes/LibriTTS/vocoder/hifigan/`).
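SpeechBrain recipes generally accept `--key=value` arguments that override top-level entries of the hparams file, which is how `--data_folder` is passed in step 3. Assuming `sample_rate` is such a top-level key in `train.yaml`, the sketch below builds a training command that overrides it without editing the file. `build_train_command` is a hypothetical helper, the value 16000 is only an example, and other settings tied to the sample rate (for example the mel-spectrogram parameters) may also need adjusting.

```python
import shlex
from typing import List, Optional


def build_train_command(data_folder: str, sample_rate: Optional[int] = None) -> List[str]:
    """Assemble the training command from step 3, optionally overriding sample_rate."""
    cmd = ["python", "train.py", "hparams/train.yaml", f"--data_folder={data_folder}"]
    if sample_rate is not None:
        # ASSUMPTION: sample_rate is a top-level key in hparams/train.yaml
        cmd.append(f"--sample_rate={sample_rate}")
    return cmd


cmd = build_train_command("/path/to/LibriTTS_data_destination", sample_rate=16000)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside `recipes/LibriTTS/vocoder/hifigan/`.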