Plus, it will run and train on a laptop no problem! (Only with text corpora, the context needs to be kept low, since a long context forces the GPU to consume more memory, so use small articles only. Later, after intensive training, the context can be re-extended.)
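As a rough sketch of what that low-context setup looks like with Unsloth (the model id below is a placeholder; adjust `max_seq_length` to whatever fits your GPU):

```python
from unsloth import FastLanguageModel

# Keep the context window small so a laptop GPU doesn't run out of memory;
# after intensive training the context can be re-extended.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="your-user/your-mistral-model",  # placeholder repo id
    max_seq_length=512,   # low context for small articles / limited VRAM
    load_in_4bit=True,    # 4-bit loading further reduces memory use
)
```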
This model will be fully Swahili-speaking despite being adapted from an English-speaking model: all training applied will be in Swahili or other dialects.

It is undergoing fine-tuning stages as well as merging and retuning stages! Currently searching for instruct datasets in Swahili.
This is a super-fine-tuned model, but it may be behind other models in the series. Hence this model is for applying LoRA adapters found on the Hub, including ones created for other models. Once a LoRA is applied, set the model in train mode with `model.train()` and train on a previously used dataset before merging the new LoRA; make sure the previous dataset is still in line with the model. A LoRA can often nudge the model the wrong way and lose some of its previous training, as it applies weights on top of the model which may not be consistent with your model, especially if the LoRA was not trained for this model (but still for the same series, i.e. Mistral).
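A minimal sketch of that apply-retrain-merge workflow with the `peft` library (the model and adapter ids are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and a LoRA adapter from the Hub; is_trainable=True
# keeps the adapter weights trainable instead of freezing them.
base = AutoModelForCausalLM.from_pretrained("your-user/your-mistral-model")
tokenizer = AutoTokenizer.from_pretrained("your-user/your-mistral-model")
model = PeftModel.from_pretrained(base, "some-user/some-lora-adapter", is_trainable=True)

# Set train mode and retrain on a previously used dataset (e.g. with TRL's
# SFTTrainer) so the adapter stays in line with the model's earlier training.
model.train()
# ... run your trainer here on the previous dataset ...

# Once the retuning looks good, merge the adapter into the base weights.
merged = model.merge_and_unload()
merged.save_pretrained("merged-model")
```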
This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.