szymonrucinski committed
Commit 26cc4d0
1 Parent(s): 9b8ca87

Update README.md

Files changed (1)
  1. README.md +28 -0
README.md CHANGED
@@ -36,6 +36,34 @@ text = "<s>[INST] Czy warto się uczyć? [/INST]"
 
 In my experience, a temperature of 0.7 is the best baseline value.
 
+ ## Optimal text generation
+ ```
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ torch_device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ # Build the prompt using the Mistral-Instruct chat template.
+ chat_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
+ messages = [
+     {"role": "user", "content": "Czy warto nauczyć się jeździć na nartach w wieku 25 lat?"},
+ ]
+ chat_tokenized = chat_tokenizer.apply_chat_template(messages, tokenize=False)
+
+ # Load the model and its tokenizer, moving the model to the target device.
+ model = AutoModelForCausalLM.from_pretrained("szymonrucinski/krakowiak-v2-7b").to(torch_device)
+ tokenizer = AutoTokenizer.from_pretrained("szymonrucinski/krakowiak-v2-7b", add_eos_token=True)
+ tokenizer.pad_token = tokenizer.eos_token
+
+ # Tokenize the prompt before generation.
+ model_inputs = tokenizer(chat_tokenized, return_tensors="pt").to(torch_device)
+
+ # Generate with beam search.
+ beam_outputs = model.generate(
+     **model_inputs,
+     max_new_tokens=1024,
+     num_beams=5,
+     no_repeat_ngram_size=2,
+     num_return_sequences=1,
+     early_stopping=True,
+ )
+ print(tokenizer.decode(beam_outputs[0], skip_special_tokens=True))
+ ```
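The added block decodes with beam search, while the note above recommends a temperature of 0.7 as a baseline. A minimal sketch of how that temperature could be applied with sampling instead, reusing `model`, `tokenizer`, and `model_inputs` from the block above (`do_sample` and `temperature` are standard `generate()` arguments; pairing them this way is an assumption, not part of the commit):

```
# Sampling-based generation at the recommended baseline temperature.
sample_outputs = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,    # switch from beam search to sampling
    temperature=0.7,   # baseline value recommended above
)
print(tokenizer.decode(sample_outputs[0], skip_special_tokens=True))
```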
 
 ## Use a pipeline as a high-level helper
 ```