Update README.md

README.md (changed)
```diff
@@ -78,12 +78,13 @@ The primary goal of this training was to demonstrate that with Spectrum CPT targ
 This method has an even more pronounced effect on larger models. It is feasible to teach a model a new language by training just a quarter of the available layers.
 
 The model has substantially improved German skills as demonstrated in RAG evaluations and numerous recognized benchmarks. In some English benchmarks, it even surpasses the Qwen2-1.5B-Instruct model.
+**Spectrum CPT can efficiently teach a new language to a large language model (LLM) while preserving the majority of its previously acquired knowledge.**
 
 Stay tuned for the next big models employing Spectrum CPT!
 
 **NOTE**
 
-For the demo,
+For the demo, the performance of the model is sufficient.
 For productive use, more German tokens can be trained on the SauerkrautLM-1.5b as required in order to teach the model even firmer German while only having a relative influence on the performance of the model (25% of the layers).
 The SauerkrautLM-1.5b offers an excellent starting point for this.
 
```
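The "train just a quarter of the available layers" idea above can be sketched in a few lines: freeze every parameter, then re-enable gradients on only a chosen subset of transformer blocks. This is a minimal illustration, not the actual Spectrum tooling — Spectrum selects layers by their signal-to-noise ratio, whereas the layer choice below (the top-most blocks) and the `ToyTransformer`/`freeze_all_but_fraction` names are placeholders invented for this sketch:

```python
# Sketch of layer-selective continued pretraining (Spectrum-style):
# freeze the whole model, then unfreeze ~25% of the blocks.
# NOTE: picking the top-most layers is a naive placeholder; Spectrum
# itself ranks layers by signal-to-noise ratio before choosing.
import torch.nn as nn


class ToyTransformer(nn.Module):
    """Stand-in for an LLM: a stack of identical blocks."""

    def __init__(self, n_layers: int = 28, d_model: int = 64):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_layers)
        )


def freeze_all_but_fraction(model: ToyTransformer, fraction: float = 0.25) -> int:
    """Freeze all parameters, then re-enable grads on the last `fraction` of layers."""
    for p in model.parameters():
        p.requires_grad = False
    n_train = max(1, int(len(model.layers) * fraction))
    for layer in list(model.layers)[-n_train:]:  # placeholder selection
        for p in layer.parameters():
            p.requires_grad = True
    return n_train


model = ToyTransformer()
n_train = freeze_all_but_fraction(model)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(n_train, round(trainable / total, 2))  # 7 0.25
```

Because only these layers receive gradient updates, the optimizer touches roughly a quarter of the weights — which is why, as the README argues, the model's existing (English) capabilities are largely preserved while the new language is learned.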