GGUF quants of gghfez/gemma-2-27b-rp-c2-v2, a finetune of the gemma2-27b base model.
All quants keep the input tensors and output weights in FP16. I found that quantizing these degraded the quality significantly.
gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf - fits into 16GB VRAM with 16k context
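For anyone wanting to reproduce a quant with FP16 input and output tensors, here is a minimal sketch of how such a command could be assembled for llama.cpp's `llama-quantize` tool. This is an assumption about tooling, not the exact command used for these files: the source file name `gemma-2-27b-rp-c2-v2.f16.gguf` is hypothetical, and the `IQ4_XS` base type is guessed from the quant's file name.

```python
from typing import List


def quantize_command(src: str, dst: str, quant_type: str = "IQ4_XS") -> List[str]:
    """Build a llama-quantize argument list that keeps the token
    embedding and output tensors in FP16 while quantizing the rest."""
    return [
        "llama-quantize",
        "--token-embedding-type", "f16",  # input (embedding) tensor stays FP16
        "--output-tensor-type", "f16",    # output weights stay FP16
        src,
        dst,
        quant_type,
    ]


if __name__ == "__main__":
    # Hypothetical input file name; adjust to wherever the FP16 GGUF lives.
    cmd = quantize_command(
        "gemma-2-27b-rp-c2-v2.f16.gguf",
        "gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf",
    )
    print(" ".join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run` if llama.cpp is installed.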
Changes since V1:
- Filtered junk out of the dataset
- prepended to chatml template (so called gemma_chatml)
I've been using the IQ4_XSl quant with SillyTavern. The latest SillyTavern has a 'gemma2' template which matches the training, but chatml works fine for me.
Seems to work pretty well with SillyTavern
Prompting
The model has been instruct-tuned with the Gemma_ChatML format. A typical input would look like this:
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
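If you are building prompts by hand rather than through a frontend, the format above can be assembled with a small helper. This is a rough sketch, assuming the usual ChatML convention of a newline after the role name and after each `<|im_end|>`. The `build_prompt` function and its `prefix` parameter are hypothetical names: `prefix` is left empty by default and is only a hook for anything your runtime expects at the very start of the prompt.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_prompt(messages: List[Dict[str, str]], prefix: str = "") -> str:
    """Render a list of {"role", "content"} turns as a ChatML-style prompt
    that ends with an open assistant turn for the model to complete."""
    parts = [prefix]
    for message in messages:
        # Each turn: <|im_start|>role\ncontent<|im_end|>\n
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    # Leave the assistant turn open so generation continues from here.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    chat = [
        {"role": "user", "content": "Hi there!"},
        {"role": "assistant", "content": "Nice to meet you!"},
        {"role": "user", "content": "Can I ask a question?"},
    ]
    print(build_prompt(chat))
```

When generating, setting `<|im_end|>` as a stop string keeps the model from running on into the next turn.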
Training:
Trained on a subset of the synthetic RP dataset from: Sao10K/c2-Logs-Filtered