Nero10578/Mistral-7B-Sunda-v1.0

Nero10578 committed on Dec 22, 2023

Commit 5054926 (1 parent: 9cb65ac)

Update README.md

Files changed (1)

README.md +10 -0

@@ -1,3 +1,13 @@
 ---
 license: apache-2.0
+language:
+- su
+- en
+- id
 ---
+This is a fine-tune of Mistral-7B-v0.1 on the very limited range of Sundanese-language datasets that are available.
+This is a learning project for me: I wanted to see whether a QLoRA fine-tune alone can teach a model a new language that it does not inherently support. The model won't speak only Sundanese; the fine-tune adds Sundanese capability to it, which I find impressive given the limited data and short training time.
+Datasets used:
+Sundanese sources from the repo below, which I cleaned and deduplicated myself.
+https://github.com/w11wo/nlp-datasets
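
The README does not say how the Sundanese text was cleaned and deduplicated. As a rough illustration only, a line-level pass might look like the sketch below. The function names `clean_line` and `dedupe_lines`, the NFKC normalization, and the case-insensitive matching are all assumptions, not the author's actual procedure.

```python
import unicodedata


def clean_line(line: str) -> str:
    """Normalize Unicode (NFKC) and collapse runs of whitespace."""
    normalized = unicodedata.normalize("NFKC", line)
    return " ".join(normalized.split())


def dedupe_lines(lines):
    """Clean each line and drop empties and repeats, preserving order.

    Two lines count as duplicates when they match after cleaning,
    ignoring case (an assumed policy, chosen for illustration).
    """
    seen = set()
    unique = []
    for raw in lines:
        cleaned = clean_line(raw)
        if not cleaned:
            continue  # skip lines that are empty after cleaning
        key = cleaned.casefold()
        if key in seen:
            continue  # an equivalent line was already kept
        seen.add(key)
        unique.append(cleaned)
    return unique


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the dataset.
    sample = ["Wilujeng  enjing ", "wilujeng enjing", "", "Hatur nuhun"]
    print(dedupe_lines(sample))
```

Exact matching like this only catches repeated lines; near-duplicates (for example, the same sentence with different punctuation) would need a fuzzier comparison.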