szymonrucinski committed 2a408a5 (parent: 56422ed): Update README.md
---
license: cc-by-sa-4.0
language:
- pl
---
# Model Card for Krakowiak-v2-7b

Krakowiak-v2-7b is a state-of-the-art 7.3-billion-parameter LLM based on Mistral-7B. It was fine-tuned for Polish text generation on a custom-built corpus of 100K Polish instructions, using techniques such as LoRA and adding noise to the embeddings for better performance. For full details on this model, please read our [paper, to be released soon](www.example.com).
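The card mentions LoRA fine-tuning; a minimal sketch of such a setup with the `peft` library might look like the following. All hyperparameters and target modules below are illustrative assumptions, not the authors' actual configuration.

```python
# Illustrative LoRA configuration using the `peft` library.
# All hyperparameters are assumptions, not the authors' actual settings.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                     # assumed low-rank dimension
    lora_alpha=32,            # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # typical attention projections in Mistral-style models
)
# The adapter would then be attached to the base model with
# peft.get_peft_model(base_model, lora_config) before training.
```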

## Model Architecture

Krakowiak-v2-7b is a major update to [Krakowiak-7b](https://huggingface.co/szymonrucinski/krakowiak-7b), bringing the following improvements:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
- A significantly larger, higher-quality training corpus
- An improved training pipeline
- Faster inference
- No stray token generation (e.g. Russian or Czech text generated alongside Polish)
- Significantly higher quality of generated text

Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
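
As a minimal usage sketch, the model could be loaded with the Hugging Face `transformers` library. The repository id `szymonrucinski/krakowiak-v2-7b` is assumed from this card's title and may differ.

```python
# Minimal sketch of loading and prompting the model with transformers.
# The repo id below is an assumption based on this model card's title.
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate(prompt: str, model_id: str = "szymonrucinski/krakowiak-v2-7b") -> str:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Polish prompt: "Briefly describe the history of Kraków."
    print(generate("Opisz krótko historię Krakowa."))
```

Note that the first call downloads roughly 15 GB of weights, so a GPU with sufficient memory (or quantized loading) is advisable.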

## Krakowiak team

[Szymon Franciszek Ruciński](https://szymonrucinski.pl)