[Feedback]

#1
by Darkknight535 - opened

You know the drill.

Darkknight535 pinned discussion

Hi... I was just wondering why you're still using Llama 3 with its smaller context window, instead of perhaps using Llama 3.1 or some other foundation model with a larger context window?

Llama 3 is better at instruction following and context handling, and it's mostly stable and logical.
Llama 3.1's quality is degraded because of the longer context length, which makes it useless.
(Nemo Mistral finetunes are not stable for instruction following.)
Gemma (9B and 27B): testing them now.
Btw, tell me which one you think is better for RP, 9B or 27B?


Though I did notice that the models go wonky once you get past 8k-16k, I thought that was less to do with Llama 3.1 being bad and more to do with insufficient training at those context lengths or something like that...

So thanks for pointing that out. I've been busy playing with the new Command R models, for the same reason (longer context length), and haven't gotten around to testing Gemma. I've heard good things though and am looking forward to it.

As for the Gemma models: they're good for writing but not for roleplaying. They treat the system prompt as if {{user}} said it to {{char}}, so they'll respond to the prompt itself as the character.

The Command R model is good but gets dumb above 10k context.. like if {{char}} is injured, he/she is still smiling, like WTH..
