ylacombe/mms-spa-finetuned-chilean-monospeaker


ylacombe HF staff committed on Jan 2

Commit d462368 • 1 Parent(s): 265d848

Update README.md

Files changed (1)

README.md +28 -5


@@ -4,14 +4,38 @@ pipeline_tag: text-to-speech
 tags:
 - transformers.js
 - mms
 ---
 ## Usage
 ### Transformers
-(TODO)
 ### Transformers.js
@@ -25,7 +49,7 @@ npm i @xenova/transformers
 import { pipeline } from '@xenova/transformers';
 // Create a text-to-speech pipeline
-const synthesizer = await pipeline('text-to-speech', 'Xenova/mms-tts-spa', {
     quantized: false, // Remove this line to use the quantized version (default)
 });
@@ -49,5 +73,4 @@ fs.writeFileSync('out.wav', wav.toBuffer());
 ```
-<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/6FvN6zFSHGeenWS2-H8xv.wav"></audio>

 tags:
 - transformers.js
 - mms
+- vits
+license: cc-by-nc-4.0
+datasets:
+- ylacombe/google-chilean-spanish
+language:
+- es
 ---
+## Model
+This is a fine-tuned version of the Spanish Massively Multilingual Speech (MMS) text-to-speech model. MMS TTS models are lightweight, low-latency models based on the [VITS architecture](https://huggingface.co/docs/transformers/model_doc/vits).
+It was trained in around **20 minutes** on as few as **80 to 150 samples** from this [Chilean Spanish dataset](https://huggingface.co/datasets/ylacombe/google-chilean-spanish).
+The training recipe is available in this [GitHub repository: **ylacombe/finetune-hf-vits**](https://github.com/ylacombe/finetune-hf-vits).
 ## Usage
 ### Transformers
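+```python
+from transformers import pipeline
+import scipy.io.wavfile  # import the submodule explicitly; a bare `import scipy` may not load it
+
+model_id = "ylacombe/mms-spa-finetuned-chilean-monospeaker"
+synthesiser = pipeline("text-to-speech", model_id)  # add device=0 if you want to use a GPU
+
+speech = synthesiser("Hola, ¿cómo estás hoy?")
+
+# squeeze() drops a leading batch axis if the pipeline returns audio with shape (1, N)
+scipy.io.wavfile.write("finetuned_output.wav", rate=speech["sampling_rate"], data=speech["audio"].squeeze())
+```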
 ### Transformers.js
 import { pipeline } from '@xenova/transformers';
 // Create a text-to-speech pipeline
+const synthesizer = await pipeline('text-to-speech', 'ylacombe/mms-spa-finetuned-chilean-monospeaker', {
     quantized: false, // Remove this line to use the quantized version (default)
 });
 ```
+<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/61b253b7ac5ecaae3d1efe0c/6FvN6zFSHGeenWS2-H8xv.wav"></audio>
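
The Transformers snippet added in this commit writes the pipeline output to disk with scipy. Where scipy is not installed, the same step can be sketched with the Python standard library alone. The helpers below, `flatten_mono` and `write_wav_16bit`, are hypothetical names that are not part of this repository or of Transformers. The sketch assumes the audio holds float samples in the range [-1.0, 1.0], passed as a plain list of shape (N,) or (1, N).

```python
import struct
import wave


def flatten_mono(audio):
    """Return a flat list of samples from a flat or nested (1, N) sequence."""
    samples = list(audio)
    if samples and isinstance(samples[0], (list, tuple)):
        if len(samples) != 1:
            raise ValueError("expected mono audio with shape (N,) or (1, N)")
        samples = list(samples[0])
    return samples


def write_wav_16bit(path, audio, sampling_rate):
    """Write float samples in [-1.0, 1.0] as a mono 16-bit PCM WAV file.

    Returns the number of samples written.
    """
    samples = flatten_mono(audio)
    # Clip to the valid range, then scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return len(pcm)
```

With the `speech` dictionary from the Transformers snippet, a call might look like `write_wav_16bit("finetuned_output.wav", speech["audio"].tolist(), speech["sampling_rate"])`. Converting with `.tolist()` first lets the helper handle both the flat and the (1, N) layout.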