cointegrated
committed on
Commit d65d330
Parent(s): 9b9680e
Update README.md
README.md
CHANGED
@@ -184,8 +184,8 @@ language:
 It is a truncated version of [NLLB-200-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) model
 (6 layers instead of 12, 512 hidden dimensions instead of 1024) with 175M parameters (131M of which are token embeddings).
 
-This model was fine-tuned on the [slone/nllb-200-10M-sample](https://huggingface.co/datasets/slone/nllb-200-10M-sample) subset of
-with 175 languages, using only the samples with BLASER score above 3.5.
+This model was fine-tuned on the [slone/nllb-200-10M-sample](https://huggingface.co/datasets/slone/nllb-200-10M-sample) subset of
+the [NLLB dataset](https://huggingface.co/datasets/allenai/nllb) with 175 languages, using only the samples with BLASER score above 3.5.
 
 Because of its small size, it is really bad at translation, but can serve as a base model for further fine-tuning for a small number of languages.
 It is recommended to prune the vocabulary of this model before fine-tuning, to preserve only the tokens used with the intended languages.
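The README's recommendation to prune the vocabulary matters because, by its own numbers, 131M of the 175M parameters are token embeddings. With a hidden size of 512, every token dropped removes one 512-value embedding row. The sketch below is not the author's procedure. It is a minimal pure-Python illustration of the idea on a toy embedding table: keep the rows for token ids that occur in the target-language data, plus special tokens, and remap the ids to a compact range. The names `prune_vocabulary`, `toy_embeddings`, `corpus_token_ids` and `special_ids` are hypothetical, and a real model would also need its tokenizer and output projection updated consistently.

```python
def prune_vocabulary(embeddings, used_token_ids, always_keep=()):
    """Keep only the embedding rows of tokens that are actually used.

    embeddings: list of rows, one per original token id.
    Returns (pruned_embeddings, old_to_new), where old_to_new maps each
    kept original id to its new, compact id.
    """
    # Sorting keeps the relative order of the surviving tokens.
    keep = sorted(set(used_token_ids) | set(always_keep))
    old_to_new = {old: new for new, old in enumerate(keep)}
    pruned = [embeddings[old] for old in keep]
    return pruned, old_to_new


# Toy table (assumption for illustration): 6 tokens, 2-dimensional embeddings.
toy_embeddings = [[float(i), float(i) * 10] for i in range(6)]
corpus_token_ids = [4, 2, 4, 5]  # ids seen in a hypothetical target corpus
special_ids = [0, 1]             # hypothetical special tokens to always keep

pruned, old_to_new = prune_vocabulary(toy_embeddings, corpus_token_ids, special_ids)

# Text tokenized with the original ids must be remapped to the new ids.
remapped = [old_to_new[t] for t in corpus_token_ids]
```

Here token id 3 never occurs and is not special, so its row is dropped and the higher ids shift down by one.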