---
license: wtfpl
---
This model is a QLoRA fine-tune of the LLaMA 2-13B base model on the FIMFiction archive, merged back into the base model (since most GGML loading apps don't support LoRAs) and quantized for llama.cpp-based frontends. It was trained with a 1024-token context length.

There are two options, depending on the resources you have:
- Q5_K_M: 5-bit K-quantized, with low quality loss. Max RAM consumption is 11.73 GB; recommended if you have 12 GB of VRAM to offload 40 layers.
- Q4_K_S: compact 4-bit K-quantized. Max RAM consumption is 9.87 GB.
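As a loading sketch (not part of the release): assuming the llama-cpp-python bindings, picking settings per the guidance above might look like the helper below. The `.bin` filenames are placeholders, not the actual release files.

```python
def load_kwargs(vram_gb: float) -> dict:
    """Choose quant and offload settings per the model card's guidance."""
    if vram_gb >= 12:
        # Q5_K_M: lower quality loss, ~11.73 GB peak; offload 40 layers to GPU.
        return {"model_path": "ggml-model.Q5_K_M.bin",
                "n_ctx": 1024, "n_gpu_layers": 40}
    # Q4_K_S: more compact, ~9.87 GB peak; CPU-only fallback.
    return {"model_path": "ggml-model.Q4_K_S.bin",
            "n_ctx": 1024, "n_gpu_layers": 0}

# Usage (requires llama-cpp-python and the model file):
# from llama_cpp import Llama
# llm = Llama(**load_kwargs(vram_gb=12))
```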

This is not an instruction-tuned model; it was trained on raw text, so treat it like an autocomplete.
It seems sensitive to formatting: it usually stays on topic better when the prompt uses double spacing.
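One way to apply the double-spacing tip is a small helper like the one below. The preference is anecdotal, so treat this as a heuristic rather than a required prompt format.

```python
import re

def double_space(prompt: str) -> str:
    # Split on any run of newlines, drop empty paragraphs, and rejoin with
    # a blank line between paragraphs (i.e., double spacing).
    paragraphs = [p.strip() for p in re.split(r"\n+", prompt) if p.strip()]
    return "\n\n".join(paragraphs)
```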

The only tags excluded from the dataset are eqg and humanized.
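For illustration, the tag exclusion could be sketched as below. The story-record shape (a dict with a "tags" list) is an assumption for this sketch, not FIMFiction's actual export format.

```python
EXCLUDED_TAGS = {"eqg", "humanized"}

def keep_story(story: dict) -> bool:
    # Keep a story only if none of its tags (case-insensitive) are excluded.
    tags = {t.lower() for t in story.get("tags", [])}
    return not (EXCLUDED_TAGS & tags)
```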