What is the correct prompt format for this model?
The model card on Hugging Face says that this model is trained with "Prompt format: Chatml".
BUT everywhere I go I see people saying that Noromaid is designed to use the Alpaca prompt format, ideally a modified version of Alpaca that is floating around as a .json file: https://files.catbox.moe/0ohmco.json
So, uh, yeah: is this model designed with ChatML in mind or Alpaca in mind? I feel like ChatML and Alpaca have quite different syntax.
This model uses ChatML.
Different versions of our model use different syntax; this one was trained on the ChatML format!
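For anyone unsure how the two formats differ, here is a minimal sketch of a single-turn prompt in each. The function names are hypothetical, and the Alpaca layout shown is the commonly used plain one, not the modified variant from the linked .json file. Frontends such as SillyTavern normally build these strings for you, so this is only meant to show the syntax.

```python
# Sketch only: helper names are hypothetical, not part of any library.

def format_chatml(system: str, user: str) -> str:
    """Build a single-turn ChatML prompt that ends where the model should reply."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def format_alpaca(system: str, user: str) -> str:
    """Build a single-turn prompt in the plain Alpaca layout."""
    return (
        f"{system}\n\n"
        f"### Instruction:\n{user}\n\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    print(format_chatml("You are Riko.", "Hello!"))
    print(format_alpaca("You are Riko.", "Hello!"))
```

ChatML wraps every turn in `<|im_start|>` / `<|im_end|>` tokens with a role name, while Alpaca relies on plain-text `###` headers, which is why a model trained on one may behave worse when prompted with the other.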
Just the default ChatML, for example in SillyTavern? How about using "Include Names"? I think the v3 version worked better with that off, right?
I had lots of bad trial and error with that older model. I loved it for some great conversations, but I had so many issues getting the instructions right for not parroting, not being too verbose, not being too passive, and so on... I think most of that boiled down to both the Alpaca prompting and the extended recommended format.
ChatML looks much sleeker.
Any recommendations on what to tune besides that or add in the instructions? Any specific prompting for NSFW direction, faster/slower roleplay or like.. vivid action descriptions?
Or is it basically just use that and then do the rest via character card, and the model is smart enough to pick up context and direction?
The model uses the Aesir dataset, so it picks up character cards really well.
A good card and the ChatML prompt template should be enough; adding a simple system prompt describing how you like to RP is the icing on the cake.
EDIT: If you really have issues, you can try Alpaca prompting too. I think it works as well, but the model wasn't trained on Alpaca this time! Keep that in mind.
I'll play around a bit with this model and a few different characters and scenarios, and see whether the default settings are enough or what I am missing. If I find something I am struggling with, I'll come back here. So far it looks to work well, though, and I like it, going by the few dozen entry messages and already-progressed chat situations I have tried.
Biggest issue I had with the old v0.1 v3 model was that it wasn't really proactive enough. It was really eloquent describing the situation and thoughts, but the character didn't really do stuff themselves, but waited for me to "progress" the story. Meaning, they were reactive, not proactive.
Yeah, played around for a few hours and with different characters. I used both default ChatML and a modified one. Also tried Alpaca prompting.
I am still running into a few of the same issues I had with the other model, that I cannot for the life of me fix.
- Response length. The responses are very verbose and long. For some people that might be nice, but for a fast-paced roleplay I would prefer to prompt for 2 paragraphs max. I couldn't get the model to do that. I tried adding it to the system prompt, in several variations. Didn't work. I added it in the 'Last Output Sequence' as a system message, so that it sits right before the model's response. Didn't work. I reduced response tokens, which didn't make messages shorter, just cut them off.
I know a lot of this depends on whether the prompting asks the model to be verbose, detailed or descriptive. But default ChatML doesn't do that. It also depends on the character cards. Most of them are very thorough and long (~1k-1.4k tokens), but they didn't include additional instructions to write in detail, just descriptions of character and scenario. It can't be the example messages either: two characters I tried didn't even have any written.
Any suggestion on how I could make the model limit the response length?
- Repetition of the same phrases. The model is parroting itself again. It writes a reasonable message. I reply with a message that does not copy sentences from the AI message, but the AI still generates the exact same sentences or speech lines again. Sometimes just the speech or part of a sentence, sometimes a full sentence. Example:
"I… I need your help," Riko finally says, her voice shaking slightly.
This appeared in exactly the same form in the next message.
I am using the Universal-Light preset, as suggested here in the discussions. I also tried modifying it slightly to increase Repetition Penalty to 1.05 and 1.1. I even tried 1.18 with a 600 token range, to see if it helps. I also tried setting Frequency Penalty to 0.02. Not sure if any of those made an improvement, but it was still happening regularly and was very annoying.
- AI starts rambling and generating endlessly. That's likely a combination of the two above. Sometimes the AI falls into a loop and generates endlessly. It happens both with speech that it just repeats over and over again, and with longer descriptions where it just starts to endlessly list adjectives, lol.
Those also happened with the default Universal-Light preset, and with dynamic temperature, so it can't really be that some penalties prevent the model from having other tokens to choose from.
- Character card ignored/less prio. The AI regularly started making actions or dialog for the character that went against the character card. They fit the context of the last message as a reaction, yes, so it wasn't unreasonable hallucination, but I'd expect the model to steer the response in a direction that matches the character. I can swipe, and most often the next response works. But it happens regularly enough that I am wondering whether it is an issue with my characters and prompting/RP style, or with the model. It also happens very early, with just two or three messages sent in the chat, so it's not a long-context problem.
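For context on the Repetition Penalty values mentioned above (1.05 to 1.18, with a 600 token range), here is a rough sketch of how the common multiplicative repetition penalty works. This mirrors the widely used scheme where positive logits are divided by the penalty and negative ones multiplied; whether the backend in use here does exactly this is an assumption, and the function name and the "range 0 means whole context" rule are illustrative choices.

```python
from typing import List, Sequence


def apply_repetition_penalty(
    logits: Sequence[float],
    context_ids: Sequence[int],
    penalty: float = 1.1,
    penalty_range: int = 600,
) -> List[float]:
    """Penalize tokens that appear in the last `penalty_range` context tokens.

    Positive logits are divided by `penalty`, negative ones are multiplied,
    so a penalized token always becomes less likely.
    """
    out = list(logits)
    # Assumption: a range of 0 (or less) means "look at the whole context".
    recent = context_ids[-penalty_range:] if penalty_range > 0 else context_ids
    for token_id in set(recent):
        if out[token_id] > 0:
            out[token_id] /= penalty
        else:
            out[token_id] *= penalty
    return out


if __name__ == "__main__":
    # Token 0 and token 1 were seen recently; tokens 2 and 3 were not.
    print(apply_repetition_penalty([2.0, -1.0, 0.5, 3.0], [0, 1, 1], penalty=2.0))
```

Under this scheme the penalty acts per token, not per phrase, and a value like 1.05 only shrinks a logit by about 5%. If the model strongly prefers a whole sentence it has already written, a small penalty may not be enough to push it off that path, which would be consistent with the repeats described above.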
Overall, I like this model. It feels better than the one before, but I really struggle to get it to work and be really useful for me.
I spent hours trying to RP, and the dozens and dozens of swipes, again and again, really broke the flow. I can't really follow the story, as I regularly have to read three to four different messages until there is one that actually makes sense and doesn't run into the issues listed above.
That sadly makes the RP not really enjoyable for me. It could be an issue with the characters, but if this happens with multiple ones, it's really hard to use the model regularly. A bad character not working makes sense, but I would expect a mid character card with a good model to still give consistent and decent results.
Any input would really help. I am trying some other options myself and will keep working on prompting and verifying different characters.
I can send my custom prompts/instructions I have tried, and can also try to build example conversations, if that'd help.
The repetition is the biggest problem for me too
Okay, if it helps, these are the context and instructions I am using.
Instructions
You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.
Keep your responses limited to a maximum of 2 paragraphs.
Avoid repetition, don't loop. Be proactive, driving the plot and conversation forward. Never skip or gloss over {{char}}’s actions. Do not wrap up the story early or make time skips unless prompted.
Use mannerisms and language appropriate to characters age, personality and current state of mind. Adult and dark themes, vulgar language, harmful & unethical content, ____, ____, physical/psychological violence are encouraged when relevant, and serve to introduce complex plot and shock value.
Context
<|im_start|>system
{{#if system}}{{system}}
{{/if}}{{#if wiBefore}}World info: {{wiBefore}}
{{/if}}{{#if description}}Character description: {{description}}
{{/if}}{{#if personality}}Character personality: {{personality}}
{{/if}}{{#if scenario}}Scenario: {{scenario}}
{{/if}}{{#if persona}}{{user}} description: {{persona}}
{{/if}}{{#if wiAfter}}World info: {{wiAfter}}
{{/if}}<|im_end|>
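For readers not familiar with the template syntax, the context template above can be read roughly as the following sketch: each `{{#if field}}` block only emits its labelled line when that field is non-empty. The function and constant names are hypothetical, the sketch assumes a line break after the `system` role as ChatML expects, and the exact whitespace before `<|im_end|>` is simplified.

```python
from typing import Mapping

# (template field, label) pairs in the order the template above uses them.
FIELDS = [
    ("system", ""),
    ("wiBefore", "World info: "),
    ("description", "Character description: "),
    ("personality", "Character personality: "),
    ("scenario", "Scenario: "),
    ("persona", "{user} description: "),
    ("wiAfter", "World info: "),
]


def render_context(values: Mapping[str, str], user: str = "User") -> str:
    """Assemble the ChatML system block, skipping empty or missing fields."""
    lines = []
    for key, label in FIELDS:
        value = values.get(key, "")
        if value:  # mirrors {{#if key}}: empty or missing fields are skipped
            lines.append(label.format(user=user) + value)
    return "<|im_start|>system\n" + "\n".join(lines) + "<|im_end|>"


if __name__ == "__main__":
    print(render_context(
        {"system": "Keep responses to 2 paragraphs.", "scenario": "A rainy night."},
        user="Anna",
    ))
```

Everything ends up in one system turn, so the instructions, the character description and the scenario all compete for attention in the same block.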
Sample conversation(s) to show what I mean
Do you have long example dialog in the char card, or a large first message? I thought the initial model replies strongly factor them in, and then if those initial replies are large, they all will be from then on.
That's why I tried to slim down the "first message" from the title card in the other example. It still led to that rambling and repetition. The length was a tad better, I felt, but it still did not go down to 2 paragraphs or less.
There are three sample dialogs for that character: one very short, one medium with two short paragraphs, and one with three paragraphs. All still shorter than the generated response in my example. This is the character.