Example Error
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper___slow_conv2d_forward)
11 generator = task.build_generator(models, cfg)
12 sample = TTSHubInterface.get_model_input(task, "testing one two three")
---> 13 wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)
14
15 ipd.Audio(wav, rate=rate)
\lib\site-packages\fairseq\models\text_to_speech\hub_interface.py in get_prediction(cls, task, model, generator, sample)
130 @classmethod
131 def get_prediction(cls, task, model, generator, sample) -> Tuple[torch.Tensor, int]:
--> 132 prediction = generator.generate(model, sample)
133 return prediction[0]["waveform"], task.sr
134
\lib\site-packages\torch\autograd\grad_mode.py in decorate_context(*args, **kwargs)
25 def decorate_context(*args, **kwargs):
26 with self.clone():
---> 27 return func(*args, **kwargs)
28 return cast(F, decorate_context)
29
\lib\site-packages\fairseq\speech_generator.py in generate(self, model, sample, has_targ, **kwargs)
160
...
--> 309 return F.conv1d(input, weight, bias, self.stride,
310 self.padding, self.dilation, self.groups)
311
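For context, the error just means the model's weights and the input tensors ended up on different devices (cpu vs. cuda:0). A minimal sketch that reproduces the same class of error, assuming a CUDA-capable machine (the layer and shapes here are made up purely for illustration):

    import torch

    conv = torch.nn.Conv1d(1, 1, kernel_size=3)  # weights live on the CPU
    x = torch.randn(1, 1, 10, device="cuda")     # input lives on the GPU
    conv(x)  # RuntimeError: Expected all tensors to be on the same device ...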
I'm still getting a feel for this myself, but what worked for me was running the model on the CPU. Judging by the files on Hugging Face, the model is fairly small, so the CPU seems like a reasonable place to start while experimenting with it.
My quick temporary solution, change the line:
    arg_overrides={"vocoder": "hifigan", "fp16": False}
to:
    arg_overrides={"vocoder": "hifigan", "fp16": False, "cpu": True}
That line sets up and configures the model for later inference/generation.
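For context, here is roughly where that line sits in the full setup. This is a sketch based on the fairseq text-to-speech hub examples; I'm assuming the facebook/fastspeech2-en-ljspeech checkpoint since the traceback matches that example, so substitute whichever model you are actually loading:

    from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
    from fairseq.models.text_to_speech.hub_interface import TTSHubInterface
    import IPython.display as ipd

    # "cpu": True keeps the model and its inputs on the CPU, which avoids
    # the cpu/cuda:0 device mismatch from the traceback above.
    models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
        "facebook/fastspeech2-en-ljspeech",  # assumed checkpoint; use yours
        arg_overrides={"vocoder": "hifigan", "fp16": False, "cpu": True},
    )
    model = models[0]
    TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
    generator = task.build_generator(models, cfg)

    sample = TTSHubInterface.get_model_input(task, "testing one two three")
    wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)
    ipd.Audio(wav, rate=rate)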
I found this in the documentation at https://fairseq.readthedocs.io/en/latest/
I noticed that the arg_overrides dictionary mirrors the command-line arguments used when running fairseq generation from the shell/command line.
(See https://fairseq.readthedocs.io/en/latest/command_line_tools.html#fairseq-generate for more specifics.)
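In other words, each key in arg_overrides maps onto a fairseq command-line flag. A rough sketch of the correspondence (flag names taken from fairseq's common and text-to-speech task arguments; double-check them against your fairseq version):

    # CLI flag               arg_overrides key
    # --cpu              ->  "cpu": True
    # --fp16             ->  "fp16": True
    # --vocoder hifigan  ->  "vocoder": "hifigan"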
That's about as much as I can help with this at the moment.