	---
	library_name: transformers
	pipeline_tag: text-to-speech
	tags:
	- transformers.js
	- mms
	- vits
	license: cc-by-nc-4.0
	datasets:
	- ylacombe/google-chilean-spanish
	language:
	- es
	---

	## Model

	This is a fine-tuned version of the [Spanish version](https://huggingface.co/facebook/mms-tts-spa) of the Massively Multilingual Speech (MMS) models, which are lightweight, low-latency TTS models based on the [VITS architecture](https://huggingface.co/docs/transformers/model_doc/vits).

	It was trained in around 20 minutes with as few as 80 to 150 samples from this [Chilean Spanish dataset](https://huggingface.co/datasets/ylacombe/google-chilean-spanish).

	The training recipe is available in the GitHub repository [ylacombe/finetune-hf-vits](https://github.com/ylacombe/finetune-hf-vits).


	## Usage

	### Transformers

	```python
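	from transformers import pipeline
	import scipy.io.wavfile  # import the submodule explicitly so scipy.io.wavfile is always available

	model_id = "ylacombe/mms-spa-finetuned-chilean-monospeaker"
	synthesiser = pipeline("text-to-speech", model_id)  # add device=0 if you want to use a GPU

	speech = synthesiser("Hola, ¿cómo estás hoy?")

	# The pipeline may return audio with a leading batch axis of size 1; squeeze() drops it so a mono file is written
	scipy.io.wavfile.write("finetuned_output.wav", rate=speech["sampling_rate"], data=speech["audio"].squeeze())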
	```
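
Some audio players and tools expect 16-bit PCM rather than floating-point WAV data. The sketch below is one way to convert float samples to 16-bit PCM using only the Python standard library. It assumes the samples lie roughly in the range [-1.0, 1.0]; the helper name `write_pcm16_wav` and the output file name are hypothetical, and a synthetic tone stands in for `speech["audio"]` so the block runs without downloading the model.

```python
import array
import math
import os
import sys
import tempfile
import wave


def write_pcm16_wav(path, samples, sampling_rate):
    """Write mono float samples (assumed to lie in [-1.0, 1.0]) as 16-bit PCM WAV."""
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # clip any overshoot
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)


# Stand-in for speech["audio"]: one second of a 440 Hz tone at 16 kHz.
sampling_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sampling_rate) for n in range(sampling_rate)]
out_path = os.path.join(tempfile.gettempdir(), "tone_pcm16.wav")
frames_written = write_pcm16_wav(out_path, tone, sampling_rate)
```

With the pipeline output from the example above, the call would look like `write_pcm16_wav("finetuned_output_pcm16.wav", speech["audio"].squeeze().tolist(), speech["sampling_rate"])`.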

	### Transformers.js

	If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
	```bash
	npm i @xenova/transformers
	```

	Example: Generate Spanish speech with `ylacombe/mms-spa-finetuned-chilean-monospeaker`.
	```js
	import { pipeline } from '@xenova/transformers';

	// Create a text-to-speech pipeline
	const synthesizer = await pipeline('text-to-speech', 'ylacombe/mms-spa-finetuned-chilean-monospeaker', {
	    quantized: false, // Remove this line to use the quantized version (default)
	});

	// Generate speech
	const output = await synthesizer('Hola, ¿cómo estás hoy?');
	console.log(output);
	// {
	//   audio: Float32Array(69888) [ ... ],
	//   sampling_rate: 16000
	// }
	```
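
As a quick sanity check on the logged output, the clip length is the sample count divided by the sampling rate. The arithmetic is written in Python here to match the first usage example; the two numbers are taken from the example output above, and the variable names are only illustrative.

```python
num_samples = 69888    # length of the Float32Array in the example output
sampling_rate = 16000  # samples per second, from the example output

duration_seconds = num_samples / sampling_rate
print(f"{duration_seconds:.3f} s")  # prints "4.368 s"
```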

	Optionally, save the audio to a WAV file (Node.js):
	```js
	import wavefile from 'wavefile';
	import fs from 'fs';

	const wav = new wavefile.WaveFile();
	wav.fromScratch(1, output.sampling_rate, '32f', output.audio);
	fs.writeFileSync('out.wav', wav.toBuffer());
	```


	<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/6FvN6zFSHGeenWS2-H8xv.wav"></audio>