Benjamin-png committed on
Commit 15f0ff3
1 Parent(s): bb5aebf

Update README.md

Files changed (1)
  1. README.md +19 -15
README.md CHANGED
@@ -21,19 +21,8 @@ You can check out the code and process used in the fine-tuning by visiting the [
 
 You can load and use the model directly from the Hugging Face model hub using either the `pipeline` API or by manually downloading the model and tokenizer.
 
- ### 1. Using the `pipeline` API
-
- ```python
- from transformers import pipeline
-
- # Load the fine-tuned model
- tts = pipeline("text-to-speech", model="Benjamin-png/swahili-mms-tts-finetuned")
-
- # Generate speech from text
- speech = tts("Habari, karibu kwenye mfumo wetu wa kusikiliza kwa Kiswahili.")
- ```
-
- ### 2. Download and Run the Model Directly
+
+ ### 1. Download and Run the Model Directly
 
 You can also download the model and tokenizer manually and run the text-to-speech pipeline without the Hugging Face `pipeline` helper. Here's how:
 
@@ -41,8 +30,8 @@ You can also download the model and tokenizer manually and run the text-to-speec
 import torch
 import numpy as np
 import scipy.io.wavfile
- from transformers import AutoTokenizer
- from vits_model import VitsModel # Assuming VitsModel is the class for this TTS model
+ from transformers import VitsModel, AutoTokenizer
+
 
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 model_name = "Benjamin-png/swahili-mms-tts-finetuned"
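
Note: the diff context shows only the top of the README's direct-usage snippet (the imports above) and its final line in the next hunk (the `scipy.io.wavfile.write(...)` call); the tokenization and inference code in between is outside this commit's context. As a rough sketch of how the new `from transformers import VitsModel, AutoTokenizer` line is typically wired up, assuming the standard `transformers` VITS API (`tokenizer(...)`, the `.waveform` output) and an illustrative output filename:

```python
import torch
import scipy.io.wavfile
from transformers import VitsModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_name = "Benjamin-png/swahili-mms-tts-finetuned"

# Load the fine-tuned VITS checkpoint and its tokenizer from the Hub
model = VitsModel.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tokenize Swahili text and synthesise the waveform
text = "Habari, karibu kwenye mfumo wetu wa kusikiliza kwa Kiswahili."
inputs = tokenizer(text, return_tensors="pt").to(device)
with torch.no_grad():
    output = model(**inputs).waveform  # shape: (batch, num_samples)

# Write the waveform to a WAV file at the model's configured sampling rate
output_np = output.squeeze().cpu().numpy()
audio_file_path = "swahili_speech.wav"
scipy.io.wavfile.write(audio_file_path, rate=model.config.sampling_rate, data=output_np)
```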
@@ -67,6 +56,21 @@ output_np = output.squeeze().cpu().numpy()
 scipy.io.wavfile.write(audio_file_path, rate=model.config.sampling_rate, data=output_np)
 ```
 
+
+ ### 2. Using the `pipeline` API
+
+ ```python
+ from transformers import pipeline
+
+ # Load the fine-tuned model
+ tts = pipeline("text-to-speech", model="Benjamin-png/swahili-mms-tts-finetuned")
+
+ # Generate speech from text
+ speech = tts("Habari, karibu kwenye mfumo wetu wa kusikiliza kwa Kiswahili.")
+ ```
+
+
+
 ### Saving and Playing the Audio
 
 To save and play the audio, you can use the same methods mentioned above:
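
The hunk above moves the `pipeline` example below the direct-usage section, and the diff cuts off just before the README's "Saving and Playing the Audio" code. For completeness, here is a hedged sketch of writing the pipeline's return value to disk; it assumes the standard `transformers` text-to-speech pipeline output (a dict with `audio` and `sampling_rate` keys) and an illustrative filename:

```python
import scipy.io.wavfile
from transformers import pipeline

tts = pipeline("text-to-speech", model="Benjamin-png/swahili-mms-tts-finetuned")
speech = tts("Habari, karibu kwenye mfumo wetu wa kusikiliza kwa Kiswahili.")

# The pipeline returns the raw waveform plus its sampling rate
scipy.io.wavfile.write(
    "swahili_speech.wav",
    rate=speech["sampling_rate"],
    data=speech["audio"].squeeze(),
)
```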
@@ -103,7 +107,7 @@ pip install torch transformers numpy soundfile scipy pydub
 
 If you're interested in reproducing the fine-tuning process or using the model for similar purposes, you can check out the Google Colab notebook that outlines the entire process:
 
- - [Google Colab Notebook](upload file to Google Drive and provide the link here)
+ - [Google Colab Notebook](https://colab.research.google.com/drive/1dK1a814UqDnXnM5Rz6NBmk-vmhdN9M4f#scrollTo=iG6IrVva27uT)
 
 The notebook includes detailed steps on how to fine-tune the MMS model for Swahili TTS.
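
The `pip install` line in this hunk's header lists `pydub`, which the README's elided "Saving and Playing the Audio" section presumably uses for playback. A minimal sketch under that assumption, reusing the WAV file written in the examples above:

```python
from pydub import AudioSegment
from pydub.playback import play

# Load the generated WAV file and play it on the default audio device
sound = AudioSegment.from_wav("swahili_speech.wav")
play(sound)
```

Playback with `pydub` needs an audio backend (for example `simpleaudio` or `ffplay`) available on the system.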
 
 
113