szymonrucinski committed 2a408a5 (parent: 56422ed): Update README.md
---
license: cc-by-sa-4.0
language:
- pl
---
# Model Card for Krakowiak-v2-7b

Krakowiak-v2-7b is a state-of-the-art 7.3-billion-parameter LLM based on Mistral-7B. It was fine-tuned for Polish text generation on a custom-built corpus of 100K Polish instructions, using techniques such as LoRA and adding noise to the embeddings for better performance. For full details on this model, please read our [paper, to be released soon](www.example.com).
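The card mentions LoRA fine-tuning; a minimal sketch of such a setup with the `peft` library might look like the following. All hyperparameters and target modules below are illustrative assumptions, not the authors' actual configuration.

```python
# Illustrative LoRA configuration using the `peft` library.
# All hyperparameters are assumptions, not the authors' actual settings.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                     # assumed low-rank dimension
    lora_alpha=32,            # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # typical attention projections in Mistral-style models
)
# The adapter would then be attached to the base model with
# peft.get_peft_model(base_model, lora_config) before training.
```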

## Model Architecture

Krakowiak-v2-7b is a major update to [Krakowiak-7b](https://huggingface.co/szymonrucinski/krakowiak-7b), bringing the following improvements:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
- A significantly larger, higher-quality training corpus
- An improved training pipeline
- Faster inference
- No stray token generation (e.g. Russian or Czech text generated alongside Polish)
- Significantly higher quality of generated text

Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
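
As a minimal usage sketch, the model could be loaded with the Hugging Face `transformers` library. The repository id `szymonrucinski/krakowiak-v2-7b` is assumed from this card's title and may differ.

```python
# Minimal sketch of loading and prompting the model with transformers.
# The repo id below is an assumption based on this model card's title.
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate(prompt: str, model_id: str = "szymonrucinski/krakowiak-v2-7b") -> str:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Polish prompt: "Briefly describe the history of Kraków."
    print(generate("Opisz krótko historię Krakowa."))
```

Note that the first call downloads roughly 15 GB of weights, so a GPU with sufficient memory (or quantized loading) is advisable.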

## Krakowiak team

[Szymon Franciszek Ruciński](https://szymonrucinski.pl)