---
language:
- en
tags:
- llama-2
- instruct
- instruction
- writing
- story
pipeline_tag: text-generation
license: other
---

# Waxwing-Storytelling-70B-LoRA model card

Waxwing is a storytelling LoRA for Llama 2 70B.
- Guide the story with Waxwing's turn-based instruction system.
- Tailor the feel of your story using style tags.
- Experience storytelling free of ChatGPT's idiosyncrasies, thanks to a "human-generated" dataset of public domain writing. Waxwing avoids GPT-isms like positivity bias, "bond" emphasis, rushed endings and exaggerated stylistic tics.

Waxwing is available:
- LoRA: As a LoRA on this branch, which can be applied at runtime to any variant of the Llama 2 70B base model (see the loading sketch after this list).
- FP16 model: Merged into the base Llama 2 model at full precision, in the [16fp](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/16fp) branch.
- Quantized for use with ExLlamaV2:
  - [2.5bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/2.5bpw)
  - [3.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/3.0bpw)
  - [4.65bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/4.65bpw)
  - [6.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/6.0bpw)
  - [8.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/8.0bpw)
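
If you want to apply the LoRA at runtime in your own script, a minimal sketch with `transformers` and `peft` could look like the following. The base-model name and adapter path here are assumptions; substitute whichever Llama 2 70B variant and local path you actually use.

```python
# Minimal sketch: apply the Waxwing LoRA to a Llama 2 70B base model at runtime.
# The model name and adapter path below are placeholders, not prescribed values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "meta-llama/Llama-2-70b-hf"        # any Llama 2 70B variant
adapter_path = "alac/Waxwing-Storytelling-70B-LoRA"  # this repository (main branch)

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Wrap the base model with the LoRA adapter.
model = PeftModel.from_pretrained(base_model, adapter_path)
```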

By using this model, you take full responsibility for anything done with its outputs.


## Model Details

### Model Description

- **Developed by:** alac
- **Model Type:** QLoRA
- **Finetuned from model:** Llama-2 70B
- **Language(s):** English


### Dataset

Waxwing was trained with a small dataset gathered from public domain writing. The exact dataset will remain private, but the code used to generate prompts and metadata is available on [github](https://github.com/alac/txt_to_dataset).
Upstage's [SOLAR](https://huggingface.co/upstage/SOLAR-0-70b-16bit) model was used to tag the dataset.


### Prompt Template

```
### System:
A chat between a user and a writing assistant.
{context}

### User:
{style tags}
Write a scene where: {events that should happen in the next scene}

### Assistant:
{output}
```
`context` is an optional story synopsis.
`style tags` should be a string along the lines of:
```
Tone: {list of tones}. Writing style: {list of writing styles}.
Written with {slow|medium|fast} pacing, in moment to moment detail, in {abstract|selective|vivid sensory} detail, from a {First|Third Person (Character)} perspective.
```
The exact values it was trained on are listed in the `dataset_tags.json` file. Anecdotally, it works better with a subset of the style tags (e.g. `Tone: tense`) or with tags that complement each other (`Tone: tense, mysterious. Writing style: dramatic. Written in abstract detail.`). It's unclear how well Waxwing responds to tags it was not trained on (e.g. 'genre').
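
For programmatic use, a small helper along these lines can fill in the template above. This is only a sketch; the example instruction, context, and style tags are illustrative placeholders, not values taken from the dataset.

```python
# Minimal sketch: assemble a Waxwing prompt from the template shown above.
# All argument values in the example call are illustrative placeholders.
def build_prompt(instruction: str, style_tags: str = "", context: str = "") -> str:
    system = "A chat between a user and a writing assistant."
    if context:
        system += "\n" + context

    user_lines = []
    if style_tags:
        user_lines.append(style_tags)
    user_lines.append(f"Write a scene where: {instruction}")
    user = "\n".join(user_lines)

    return f"### System:\n{system}\n\n### User:\n{user}\n\n### Assistant:\n"


prompt = build_prompt(
    instruction="the detective finds a letter hidden behind the bookshelf",
    style_tags="Tone: tense, mysterious. Writing style: dramatic. Written in abstract detail.",
    context="A noir mystery set in a rain-soaked harbor town.",
)
print(prompt)
```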

For SillyTavern users, the `style tags` work well in the "Author's Note" field at depth 1. User messages should begin with `Write a scene where: `; to continue a scene, just type `continue`. Most testing was done using the [Genesis](https://github.com/SillyTavern/SillyTavern/blob/8e73882c9ba7301c9163befbe445686a79d4a9a8/public/TextGen%20Settings/NovelAI%20(Genesis).settings) preset.


### Training

Waxwing was trained on a single machine with 72GB of VRAM. The training parameters are available in the `training_parameters.json` file on the main branch. Training was done with FartyPants' [Training_PRO](https://github.com/FartyPants/Training_PRO) extension for the Oobabooga Text Generation WebUI.